ABSTRACT


A Byzantine Music piece performed by a well-recognized chanter is used in order
to derive experimentally the mean frequencies of the first five tones (D to A) of the
diatonic scale of Byzantine Music. Then the experimentally derived frequencies are
compared with frequencies proposed by two theoretical scales, both representative of
traditional Byzantine Music chanting. We found that if a scale is performed by a
traditional chanter, it is very close in frequency to the frequencies proposed theoretically,
except for tone F. An allowed frequency deviation from the mean frequencies for each tone
is determined. The concept of allowed deviation is not provided by theory. Comparing 
our results to the notion of pitch discrimination from psychophysics it is further 
established that the frequency differences are minute. The Attraction Effect is tested for a 
secondary tone (E) and the effect is quantified for the first time. The concept of the 
Attraction Effect has not been explained in theory in terms of frequencies of tones. 


CHAPTER 1 


INTRODUCTION 


Byzantine Ecclesiastical Music (or simply Byzantine Music) refers to the contemporary 
music used mainly in the Greek Orthodox and the Arabic Orthodox Church. Byzantine Music 
has its roots in the early Christian centuries. Borrowing the system, i.e., the sequence of the
intervals in a scale, directly from work done by earlier scholars such as Pythagoras and others, 
Byzantine Music has evolved throughout the centuries to the present day mainly by means of 
tradition. 

Since the early days of Byzantine Music different musicians and poem prayer writers 
have attempted to give musical notations that would guide the performers through their task of 
singing these poems in religious gatherings. Historically, the early Christians would gather 
together in secret places like catacombs and would chant all together some basic tunes. Most of 
these tunes would have no melody, let alone harmony, and be sung monophonically with some 
distinct rhythm, driven mostly by the prosodic intonation of the text. This form of monotonic 
diction would be a simplified version of what Homer, the ancient epic poet, would use in 
delivering his poems in public.*

* Plato often advised “the lyrics should regulate (guide) the music, and not the music the lyrics”.

As time progressed, various poem writers would write poems that obey a melody based
on specific modes used earlier by the ancients, thus standardizing some melodies and poems that
would be uttered by all Christians together in the early gatherings. Most of these poems and
melodies have survived to this day and are performed primarily in the Greek Orthodox Church in
the original ancient language.

Early Christian poem writers aimed to write their poems and texts in general in a clear 
and common language (Alexandrian common or koine) that everyone would understand. The 
basic melodies used were meant mainly to aid the uneducated to remember the text rather than to 
please him acoustically. Through the years, however, these simple, basic melodies evolved into 
an art form; an art that mostly evolved and progressed during the years of the Byzantine Empire, 
hence the term Byzantine Music. The word “ecclesiastical” means pertaining to or related to
church (from the Greek ecclesia, “church”, from ek (from) + kalo (to call)). It is used to distinguish it from the
music of the Byzantines outside the church, usually referred to as Byzantine Cosmic Music, 
cosmic meaning “for the people”. In this thesis the term Byzantine Music is used 
interchangeably with Byzantine Ecclesiastical Music. 

Byzantine Music as an art form is now performed by chanters, Orthodox Christians 
educated in music and language who can render the meaning of the text chanted according to the 
traditions of the Eastern Orthodox Church. Not only has the body of written genres, essays, 
poems and theological and philosophical work grown tremendously, but also Byzantine Music is 
now a complicated art having its own notation and musical scales (system), rhythm, tempo, kinds, 
modes and genders* (Panagiotopoulou, 1981).

* The above are the “ingredients of melody” as listed in most of the standard theories on Byzantine Music. Whatever
information is beyond the scope of this thesis will not be discussed here, but will rather be contained in the
referenced citations.

It is important to understand the context in which Byzantine Music evolved into this
contemporary form of art. Byzantine Music serves the purpose of connecting the faithful to
his/her creator according to the traditions and standards of the Eastern Orthodox Church as
defined by the Patriarchate of Constantinople. Within this context there are specific and
well-defined restrictions applied by the church to ensure that no external elements intrude into the
structure of Byzantine Music. This is why, for example, Byzantine Music enjoys a very simple
form of harmony which basically consists of a continuous monotonic base tone sounded
simultaneously with the melody. This ancient form of harmony will be discussed later; it is a
possible explanation of why higher harmonics seem to possess greater amplitudes.

Another consequence of the purpose that Byzantine Music serves is that Byzantine Music 
is strictly vocal. No instruments are allowed into the church to accompany the chanter’s voice. A
Byzantine Music choir is made up of men only; no female chanters are allowed unless the
choir is serving an all-female monastery. The fact that Byzantine Music is designed for the human
voice has an inevitable effect on the way its scales are structured. For centuries instruments have 
been devised to produce music. Engineering techniques are put to the test every time a new 
instrument is made, say a new piano or a pipe organ. It is up to the musician to say if the 
instrument is performing the scales adequately or not. 

Throughout the history of music we have seen a plethora of scales that musicians and 
theorists came up with to solve harmony problems. This is not the case in Byzantine Music, 
however. The ear and brain (the inspection devices) are embodied in the instrument (the chanter),
so that every time there is an inconsistency somewhere in the performance of the scale the voice 
adjusts its pitches in such a way as to satisfy the ear. 

There is an extended literature on how the scales came about. This is a matter of interest 
to the music historian. The key idea in the history of scales is that at different points in time 
someone revised an existing scale so that new instrument designs would perform different 
intervals without unpleasant dissonances. Byzantine Music, on the other hand, is totally vocal,
and in effect there are no dissonances to be found in it, unless of course the chanter himself is
tone-deaf and his ear cannot check up on him and tell him that what he is chanting is completely
unpleasant!

* The continuous base tone sounded under the melody is called “isokratima” (from the Greek ison (equal) + krato (to hold)) or simply “ison”.

As a result we are today faced with a group of Byzantine Music scales that have been 
passed down from generation to generation, from teacher to student for more than two millennia. 
These scales comprise a variety of intervals, not only the Western tones and semitones. And 
through the years many scholars, mathematicians, musicians, physicists, etc., have attempted to 
quantify these various intervals in terms of numbers, i.e., to assign a numerical value to each 
interval. Accounts of this tendency to ascribe numbers to intervals can be found as far back as
2,500 years ago; one needs only read through the work of Pythagoras and other ancient scholars in
order to convince oneself that this idea of quantifying the scales has been taken rather seriously
for quite some time. In this thesis some background information will be given in order to
illuminate the scope of the present research, but nothing will be discussed in depth, as there are
numerous excellent resources for this sort of investigation (Backus, 1969; Benade, 1990;
Chrisanthou, 1832; Jeans, 1968).

For the rest of this research paper I have decided to use a terminology that will make the 
text easier to read by English speakers, in terms of Byzantine Music terminology. There is a 
variety of terms translated into English by different authors of Byzantine Music manuals and 
theories. Some of them I find counterintuitive and hard to remember. I will give the translations
of words instead of simply the transliterations. For example, I will say “kind” instead of “eidos”
and “mode” instead of “echos”. This way the reader familiar with the original terminology will
relate immediately, and the reader who is less familiar (or not familiar) will have a term which,
even though it does not translate the meaning exactly, is remembered more easily. Definitions necessary
to understand the context will be given as needed. I will also try to avoid repeating definitions
where an alternative explanation can be given. Furthermore, some of the sentences given are
translated directly from the ancient or modern Greek by the author. No individual references are
given for definitions that are found in standard theory handbooks of Byzantine Music
(Crisanthou, 1832; Panagiotopoulou, 1981).

In the following subsections I briefly describe some of the theoretical themes of 
Byzantine Music that are necessary for understanding the remainder of this Thesis. Then I go on 
to describe the methodology and some information necessary to understand the signal processing
analysis of the data.


CHAPTER 2 


THEMES 


Notation 


Byzantine Music has its own notation derived from Greek symbols and alphabetical 
characters. Individual symbols are called characters and are divided in two categories: the 
characters of quantity and the characters of quality. The characters of quantity tell the musician, 
given an initial pitch, how many notes he should ascend or descend. The characters of quality tell 
the musician how to get to a specific tone and once he is there how to perform it. 

There are notation characters as old as the 5th century AD. The more complicated the
melodies became, the more sophisticated the notation became. By the beginning of the 19th
century Byzantine Music notation was comprehensive and complete. The notation used prior to
1814 is known as the “old notation” as opposed to the new one in use today. The new notation
was proposed by its three founders: Chrisanth, Bishop of Dirrachion (1843); Hourmouzios, the
archive-keeper (1840); and Gregorios, the Arch-chanter (1822). The need for a new notation arose
primarily because of the complexity of the old notation, which took some 15 years to master,
perhaps because it was based on remembering long musical lines. The new notation given by the
three fathers of the new system of notation is much more analytical and enables us to write a
variety of new melodies impossible to write with the old notation. On the other hand, many argue
that the reason the old notation was so cumbersome and largely based on memory is that by
using fixed musical lines the ancient melodies are preserved better.

The new notation, the one used today, takes three to five years to learn adequately. It is 
much more flexible than the old notation and allows musicians to elaborate and even analyze 
older manuscripts. How much elaboration should be done is a subject much too sensitive to
chanters and better left for another discussion. Due to the ability of the new notation to write new
melodies and due to the dispersion of chanters to different geographical locations (thus being
exposed to other musical stimuli), different Byzantine Music schools or waves were formed. One
of the main points of disagreement between these schools is the musical scale intervals, the
subject of the present research. A sample of the manuscript of the music piece used here is in
Appendix A.


Scales, Modes and Intervals 


There are four different main scales used in Byzantine Music. By different I mean 
different not in terms of what the frequency of the first tone of the scale is, but different in terms 
of the sequence of the intervals in each scale. For example, the European scale “C Major” is
identical with “D Major” in terms of intervals within the octave. They both follow the interval
sequence:

T-T-S-T-T-T-S

where T denotes a tone and S denotes a semitone. The same sequence is followed by any major
scale and the only difference is in terms of the frequency of the first note. If we rearrange the T’s
and S’s in the sequence above we will end up with the European minor scale, which has the
following sequence:

T-S-T-T-S-T-T

It is evident from the above drastically simplified discussion that major and minor scales (and for that
matter any other kind of European scale) consist of tones and semitones.

In Byzantine Music, however, there are no instruments used and to talk of a baseline
frequency is immaterial. The chanter chooses a convenient frequency to start his singing based
upon his vocal range. The sequence of the musical intervals in Byzantine Music
defines what is known as a scale. The word scale in Byzantine Music has a different meaning and
is closely related to the system of Byzantine Music, i.e., the sequence of the intervals within the
given scale and not the frequency associated with the first tone of the scale. In other words, scale,
used in the Byzantine Music sense, means only Minor or Major, without any note next to it.
Mode, on the other hand, is a set of rules that defines a distinct way of performing a piece of
music. One of these rules is that one of the scales must be used mainly within a given mode. For
example, “major” in European music language means, roughly, “plagios of the fourth” mode
in Byzantine Music language.

The four scales used in Byzantine Music are the diatonic, the two chromatics, and the 
enharmonic. Each follows a distinct sequence of intervals that include more than tones and 
semitones. It is beyond the scope of this paper to discuss any other scale (or its intervals) but the
diatonic. The diatonic is used to perform the music piece from which we collected our data.


The diatonic scale makes use of three kinds of intervals: the major interval, the minor
interval, and the least interval.* Here I will abbreviate them as “M” for major, “m” for minor and
“l” for least. To be consistent with the definition of a scale given above, we put these intervals in
a sequence so that we construct the diatonic scale. The diatonic scale is as follows:

m-l-M-M-m-l-M
Next we need to define exactly what the intervals are. So far most of the derivations of scale 
intervals have been based on arithmetical models proposed by musicians and scientists from 
different disciplines. Personally I strongly believe that assigning numbers to the above intervals
is not as necessary as it looks for performing Byzantine Music. In other words, Byzantine Music
would have survived to this day exactly the way it has even if no one had ascribed numerical values to
the intervals. This kind of vocal music has been passed from teacher to student orally for centuries.

* Their transliterations are meizon, elasson, and elahistos, respectively.

The first theorist to officially assign numerical values to these intervals was Chrisanth,
the Bishop of Dirrachion, in 1814.† Prior to that publication (Crisanthou, 1832), musicians relied
solely on their “good ears” to learn the intervals. After the numerical values were published,
virtually nothing changed. Byzantine Music was basically taught the same way after the new
notation and after numerical values were assigned to the intervals as it was before. Then why do
Byzantine Music theorists consider these numerical values so important? My guess is because of
the western influence of the times. Around that time (early 1800s) many great theorists were
dealing with this new hot topic of finding the correct value for tones and semitones, so that
European musicians would be able to play more harmonious chords. Even great mathematical
minds (Euler) were puzzled by why a particular interval is pleasant or unpleasant to the musical
ear. In Byzantine Music we use no chords and no instruments; therefore ascribing this or that
value to an interval does not really mean much to the young student who learns the scale for the
first time. Whatever this young student hears coming out of his teacher’s mouth, that is the
correct interval between two notes. The number in the book is something highly symbolic and
will not change the way chanters perform these scales. This last idea would be of great interest to
the psychoacoustician, but is too far from the scope of this paper for discussion here.

† There have been some earlier attempts to assign numbers to the musical intervals, but here I will consider only the
more formulated derivations suggested during the last two centuries.

Byzantine Music intervals were the center of discussions for a couple of centuries. Many 
influential theorists gave their suggested intervals and a portion of the Byzantine Music world 
followed this or that theory. Numerical intervals suggested were based upon a broad range of 
interpretations. In this paper we will focus our attention on two of the versions of these 
numerical intervals most accepted today. First we consider an older version proposed by 
Chrisanth in 1814 and then another proposed by the Patriarchal Committee of 1881. The 
Patriarchal Committee of 1881 was an official committee appointed by the church to “resolve” 
any inconsistencies the scales of Chrisanth may have had. It was composed of the most 
influential musicians of the time and most of them had been trained in European and oriental 
music, and in Byzantine Music as well. The new scale that was published by the Patriarchal 
Committee is widely accepted today and is used in many theoretical books as the standard one. 


These two scales are best understood visually and are drawn to scale in Figure 1 below: 
[Figure 1: two scale diagrams, (a) and (b), each labeled with the Byzantine tone names and the
corresponding European names, e.g. Pa (D), Vou (E), Ga (F). Total: 68 atoms for (a), 72 atoms for (b).]

Figure 1. Two proposed versions for the diatonic scale. (a) Chrisanth in 1814; (b) Patriarchal Committee of 1881.
Numbers refer to atoms.


At first glance at Figure 1 the reader will notice that (b) is longer than (a). This is
a consequence of drawing the scales according to the ratios of each tone to the total length.
Furthermore, the scale in (a) is divided into 68 equal parts and (b) is divided into 72. We are going
to call these parts atoms, because they represent the smallest non-divisible interval. No matter
how many atoms there are from D to D, the frequency ratio for one octave is 2:1, i.e., whatever the
frequency of D is, the frequency of D an octave above should be double.
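To make the role of the atoms concrete, the following Python sketch converts atom counts into frequencies for the tones D to A. It rests on an assumption that the text does not state explicitly: that atoms are equal logarithmic steps, so that rising by a atoms in a scale of N atoms multiplies the frequency by 2^(a/N). The starting pitch of 150 Hz and the names `SCALES` and `tone_frequencies` are hypothetical and serve only as illustration.

```python
# Step sizes in atoms for D-E, E-F, F-G, G-A (minor, least, major, major),
# as quoted in the text for the two theoretical scales.
SCALES = {
    "Chrisanth 1814": {"total": 68, "steps": [9, 7, 12, 12]},
    "Patriarchal 1881": {"total": 72, "steps": [10, 8, 12, 12]},
}
TONES = ["D", "E", "F", "G", "A"]


def tone_frequencies(base_hz, steps, total):
    """Frequencies of successive tones, starting from base_hz.

    ASSUMPTION: atoms are equal logarithmic steps, so rising by `a` atoms
    multiplies the frequency by 2 ** (a / total).
    """
    freqs = [base_hz]
    atoms = 0
    for step in steps:
        atoms += step
        freqs.append(base_hz * 2 ** (atoms / total))
    return freqs


if __name__ == "__main__":
    base = 150.0  # hypothetical starting pitch for D, in Hz
    for name, scale in SCALES.items():
        freqs = tone_frequencies(base, scale["steps"], scale["total"])
        row = ", ".join(f"{t}={f:.1f}" for t, f in zip(TONES, freqs))
        print(f"{name}: {row}")
```

Under this assumption a full octave of either scale doubles the starting frequency, in agreement with the 2:1 ratio stated above, whatever pitch the chanter chooses for D.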

A comparison between the scales in Figure 1 and the European tone and semitone is in
order. Byzantine Music theorists think of the major interval of 12 atoms as being the same as the
tone of European Music. This was actually officially decided by the Patriarchal Committee of
1881. A semitone would then be 6 atoms. Therefore, the major interval (12) is roughly
equivalent to the tone (12) in European Music, the minor is roughly 3/4 of a tone and the least is
slightly greater than the semitone. Of course the above comparisons are rough estimates, because
we need to consider not the numbers themselves alone, but the ratios of the numbers to the total
number of atoms in both cases.


The minor interval from D to E in scale (a) is ascribed the number 9 and the total length of the
scale is 68; therefore, for scale (a) a minor interval has a ratio of interval to scale of 9/68, or
approximately 0.1324. The minor interval from D to E in scale (b) is 10/72, or approximately
0.1389, not very far from 0.1324. Similar ratios can be found not only for the intervals of the
scales shown in Figure 1, but also for other scales proposed by other Byzantine Music
theorists. Values for the diatonic scale are given in Table 1 for reference.

Table 1. Interval-to-scale ratio (atoms).

Interval   Chrisanth          Patriarchal
Major      12/68 ≈ 0.1765     12/72 ≈ 0.1667
Minor       9/68 ≈ 0.1324     10/72 ≈ 0.1389
Least       7/68 ≈ 0.1029      8/72 ≈ 0.1111
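The interval-to-scale ratios are simple quotients, and a short Python sketch can recompute them from the atom counts quoted in the text (12, 9, 7 out of 68 for Chrisanth; 12, 10, 8 out of 72 for the Patriarchal Committee). The names `INTERVALS`, `TOTALS` and `interval_to_scale_ratios` are illustrative only.

```python
from fractions import Fraction

# Atom counts per interval: (Chrisanth 1814, Patriarchal Committee 1881).
INTERVALS = {
    "Major": (12, 12),
    "Minor": (9, 10),
    "Least": (7, 8),
}
TOTALS = (68, 72)  # atoms per octave in each scale


def interval_to_scale_ratios():
    """Return {interval name: (ratio in scale a, ratio in scale b)} as exact fractions."""
    rows = {}
    for name, (atoms_a, atoms_b) in INTERVALS.items():
        rows[name] = (Fraction(atoms_a, TOTALS[0]), Fraction(atoms_b, TOTALS[1]))
    return rows


if __name__ == "__main__":
    for name, (ratio_a, ratio_b) in interval_to_scale_ratios().items():
        print(f"{name:5s}  {float(ratio_a):.4f}  {float(ratio_b):.4f}")
```

Keeping the ratios as exact fractions until the final formatting step avoids any ambiguity between rounding and truncation of the fourth decimal place.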

The scales in Figure 1 are so similar in terms of intervals that they are essentially the same.
Why, then, are there so many versions of what is acoustically essentially the same scale? This
question is best answered with an example that is representative of most of the reasons
that drove Byzantine Music theorists to revise its scales. Let us consider what led the Patriarchal
Committee of 1881 to revise the scale to the one shown in Figure 1(b). Remember that at the
same time in Europe many physicists and other scientists were trying to solve the problem of
dissonance for some intervals (thirds, fourths, fifths), thus coming up with different numerical
intervals for the European scale. What became known as the Equal Temperament Scale, which
basically consists of 12 semitones of exactly equal frequency ratio, was first calculated by the
French mathematician Mersenne (Harmonie Universelle, 1636) and J. S. Bach was the one that
standardized it. In a nutshell, if the frequency ratio of a whole octave is universally accepted to
be 2:1, then we can construct a scale of 12 semitones in which the frequency ratio of each tone to
the next is exactly $2^{1/12} \approx 1.059463094$. This scale adjustment had a huge effect on
the musical world of the eighteenth century as new and easy to play instruments were designed.
The Patriarchal Committee of 1881 decided to equate the European tone to what they
knew as the major tone, which has 12 atoms. Then there are 6 atoms in a semitone and twelve such
semitones make up a Byzantine Music scale of 72 atoms. So they decided to add an atom to the
minor tone, making it 10 instead of 9, and another atom to the least tone, which went from
an interval of 7 atoms to one of 8. This way each tetrachord (a special kind of “fourth”) was
increased by two atoms, and with two tetrachords in an octave we have an increase of 4 atoms
in total.
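The arithmetic of this revision can be checked with a few lines of Python. The sketch assumes, following the diatonic sequence m-l-M-M-m-l-M, that an octave consists of two tetrachords (each one major, one minor and one least interval) plus one additional major tone; the function names are hypothetical. `atoms_to_ratio` further assumes that atoms are equal logarithmic divisions of the octave.

```python
SEMITONE_RATIO = 2 ** (1 / 12)  # frequency ratio of one equal-tempered semitone


def octave_atoms(major, minor, least):
    """Total atoms in an octave built as tetrachord + major tone + tetrachord."""
    tetrachord = major + minor + least
    return 2 * tetrachord + major


def atoms_to_ratio(atoms, total=72):
    """Frequency ratio spanned by `atoms`.

    ASSUMPTION: atoms are equal logarithmic divisions of the 2:1 octave.
    """
    return 2 ** (atoms / total)


if __name__ == "__main__":
    print("Chrisanth octave:  ", octave_atoms(12, 9, 7))
    print("Patriarchal octave:", octave_atoms(12, 10, 8))
    print("Semitone ratio:    ", round(SEMITONE_RATIO, 9))
    print("6 atoms of 72:     ", round(atoms_to_ratio(6), 9))
```

With these assumptions, 6 atoms out of 72 span the same frequency ratio as one equal-tempered semitone, which is the equivalence the Committee adopted.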

This scale revision, however, did not contribute much to the way Byzantine Music was
taught or performed. The Equal Temperament scale was intended to solve a problem that
occurs when music is played with instruments that produce a fixed frequency for a given tone.
For the violinist playing without the accompaniment of other instruments, or the chanter, who
sings without instruments, there is no limitation on whether they use the Just scale, the Gypsy scale,
the Mean Tone scale, the Equal Temperament scale or any other scale for that matter; what is of
the essence is for the musical piece to be pleasant to the ear. Of course, the scope of Byzantine
Music is not primarily to be pleasant aesthetically, but it is important for any music not to have
dissonant intervals within its scales. Here hinges the purpose of this paper: to quantify the scales
performed which are pleasant to the ear.
After this brief introduction to the Byzantine Music scale it is meaningful to describe the
scope of this paper. Unlike the previous top-down approach, i.e., to find the numerical intervals
by arithmetic means, here we attempt to determine the intervals another way. Using music pieces
performed by a well-recognized performer, we try to find the intervals that he uses. This may
sound a bit on the simplistic side at first glance, but recall that the concept of musical acoustics
was based primarily upon how pleasant or unpleasant two tones of different frequencies are
when played together. In the past people used instruments to accompany their music. Because
modern instruments (at least the ones we are accustomed to) have limitations, we have often
abandoned the aesthetic aspect of a piece of music that is not accompanied by instruments.

From this perspective Byzantine Music is one of the very few forms of unaccompanied 
music that is still alive and performed today. That is where we are going to find authentic scales 
performed not according to instrumental restrictions, but according to what we (humans) defined 
as pleasurable to the ear before we had constrained ourselves in the mold of tones and semitones. 

I will conclude this section by quoting a very insightful comment by Sir James Jeans,
because I could not have summarized it better myself (Science and Music, 1937, p. 176). Here the
author talks about the Equal Temperament scale, the one upon which European Music is based to
this day, and some of its drawbacks: “The pianist and the organist accept this accumulation of
lesser evils [that the Equal Temperament scale entails] in order to escape the major evils of badly
discordant intervals. But the violinist and the singer are under no such necessity; as each interval
comes along, they can make it what they like, and so naturally tend to make it that which gives
most pleasure to the ear. Observations shew that the intervals which such performers produce
when they are left to themselves differ greatly from those they produce when accompanied by an
instrument tuned to equal temperament.”
The observations which Sir James Jeans speaks of are not to be found commonly in
contemporary literature, but are well-known to musicians. With today’s advancements, however,
in the field of computing and signal processing, collecting and analyzing such data will not be as
cumbersome as it was then. The next two subsections give some information on the Fourier
analysis techniques used in this paper and the methodology.


Fourier Analysis Techniques 


Acoustical signal processing refers to sound or vibration analysis, which extracts
information critical for understanding the physical mechanisms underlying noise and vibration
(Crocker, 2003). In most cases the physicist is interested in the frequency of a signal,
primarily because many sound behaviors such as propagation, emission, diffraction, and
transmission are frequency dependent. Not only do purely physical aspects of sound have
frequency dependence, but the animal (from insects to humans) sensation of sound is also highly
dependent on frequency. With the increasing importance of media, antennas, underwater
acoustics, etc. over the last few decades, acoustical signal processing has become an important tool for
the physicist and the engineer.

In this paper we will make extensive use of the Fourier Transform (FT) and the
spectrogram. A transform is an equation that, given a signal, can provide a spectrum. An inverse
transform is any equation that, given the spectrum, can recover the signal. The two defining
equations for a transform and its inverse transform are usually called a transform pair. In general,
there are many ways to define such transforms. Transforms used for theoretical representations
and proofs usually require an infinite set of data. For example, Fourier transforms, Laplace
transforms, Fourier series, and z-transforms are widely used in proving theoretical relations in
most standard textbooks. For finite data records, however, which are the case in real life
applications, only finite transforms can be used. Among them one particular transform is of great
importance to the physicist and engineer interested in real representations: the Discrete Fourier
Transform (DFT). The DFT is based on a z-transform derivation (Crocker, 2003) and
is the most widely used tool for extracting spectra from finite length data. Since it was first
introduced (Good, 1951), the DFT gained popularity quickly among the scientific community
due to its applicability. Since then a computer algorithm was developed to compute DFTs faster
and more efficiently, known as the Fast Fourier Transform (FFT) (...). The computation time
for an FFT is substantially shortened, especially in applications where long N-point signals need
to be processed. In general, the number of operations is proportional to $N \log_2 N$ for
an FFT, as compared to $N^2$ for a direct evaluation of the DFT.
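To give a feeling for the difference between the two operation counts, the following Python sketch evaluates N² and N·log2(N) for a few power-of-two lengths. These are proportionality estimates only, not exact counts for any particular implementation, and the function names are illustrative.

```python
import math


def dft_ops(n):
    """Order-of-magnitude operation count for a direct N-point DFT (N squared)."""
    return n * n


def fft_ops(n):
    """Order-of-magnitude operation count for a radix-2 FFT (N log2 N)."""
    return n * math.log2(n)


if __name__ == "__main__":
    for n in (256, 1024, 4096):
        ratio = dft_ops(n) / fft_ops(n)
        print(f"N={n:5d}  direct ~ {dft_ops(n):9d}  FFT ~ {fft_ops(n):9.0f}  ratio ~ {ratio:.1f}")
```

For a 1024-point record the estimate is about a hundredfold saving, and the advantage grows with N.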

There are various computer software programs for calculating DFTs. For this paper
Matlab® version 5.2.0.3084 with a Signal Processing Toolbox was used. When using software it
is important to know the specifications of that software in relation to the application one wishes
to perform, in order to avoid unwanted results. Matlab® defines the DFT as follows:

$$X(k) = \sum_{n=1}^{N} x(n)\, e^{-j 2\pi (k-1)(n-1)/N}, \qquad 1 \le k \le N \tag{1}$$

$$x(n) = \frac{1}{N} \sum_{k=1}^{N} X(k)\, e^{j 2\pi (k-1)(n-1)/N}, \qquad 1 \le n \le N \tag{2}$$

where X(k) is the DFT, calculated with a radix-2 FFT algorithm, and x(n) is the original discrete
signal. The above two equations are the transform pair we will use in this paper. Equation (1) is
the DFT and equation (2) defines the Inverse Discrete Fourier Transform (IDFT).
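A direct, unoptimized evaluation of such a transform pair can be sketched in Python as follows. The sketch assumes the common convention of a negative exponent and no scaling in the forward transform, with the factor 1/N in the inverse. Python lists are indexed from 0, so the product (k-1)(n-1) of a 1-based formulation becomes simply k·n. This is an illustration of the definition, not the FFT algorithm that Matlab® actually runs.

```python
import cmath


def dft(x):
    """Direct DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N), 0-based indices."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def idft(X):
    """Inverse DFT with the 1/N normalization placed on the inverse."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    spectrum = dft(signal)
    recovered = idft(spectrum)
    print([round(abs(c), 6) for c in spectrum])
    print([round(c.real, 6) for c in recovered])
```

Applying `idft` to the output of `dft` recovers the original samples up to rounding error, which is the defining property of a transform pair.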


The Fourier coefficients a and b associated with the signal x(n),

$$x(n) = a_0 + \sum_{k=1}^{N/2} \left[ a(k) \cos\!\left(\frac{2\pi k\, t(n)}{N \Delta t}\right) + b(k) \sin\!\left(\frac{2\pi k\, t(n)}{N \Delta t}\right) \right], \tag{3}$$

are given by:

$$a_0 = \frac{X(1)}{N}, \qquad a(k) = \frac{2\,\mathrm{Re}[X(k+1)]}{N}, \qquad b(k) = -\frac{2\,\mathrm{Im}[X(k+1)]}{N}, \tag{4}$$

where x(n) is the discrete signal sampled at times $t(n) = (n-1)\Delta t$ with a $\Delta t$ time spacing. Matlab® uses the DFT in
equation (1). A fast radix-2 fast-Fourier transform algorithm is used if the length of X is a power
of two. If the length of X is not a power of two, a slower non-power-of-two algorithm is
employed. The above specifications came directly from the software’s help files; we quoted
these specifications for reference.
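The relation between the DFT and the cosine and sine coefficients can be checked numerically. The Python sketch below assumes the convention a0 = X(1)/N, a(k) = 2 Re[X(k+1)]/N and b(k) = -2 Im[X(k+1)]/N, which is what a forward transform with a negative exponent implies; the minus sign on b(k) follows from that exponent. It restricts k to values below N/2, because the term at exactly N/2 needs a different factor. The test signal and all names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct DFT with negative exponent and no scaling (0-based indices)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def fourier_coefficients(x):
    """Return (a0, a, b), where a[k-1] and b[k-1] hold a(k) and b(k) for 1 <= k < N/2."""
    N = len(x)
    X = dft(x)
    a0 = X[0].real / N
    a = [2.0 * X[k].real / N for k in range(1, N // 2)]
    b = [-2.0 * X[k].imag / N for k in range(1, N // 2)]
    return a0, a, b


if __name__ == "__main__":
    N = 16
    # Hypothetical test signal: offset 0.5, cosine of amplitude 3 at k=2,
    # sine of amplitude 1.5 at k=3.
    x = [
        0.5
        + 3.0 * math.cos(2 * math.pi * 2 * n / N)
        + 1.5 * math.sin(2 * math.pi * 3 * n / N)
        for n in range(N)
    ]
    a0, a, b = fourier_coefficients(x)
    print(round(a0, 6), round(a[1], 6), round(b[2], 6))
```

The recovered values match the amplitudes used to build the signal, which is a convenient sanity check on sign and scaling conventions before interpreting real spectra.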

The term 1/N in equation (2) is a normalization factor that in the literature sometimes
appears in equation (1) and sometimes in equation (2). The reason is that in practical
computations of the DFT the signal processor usually needs to multiply his equation by a
numerical factor that either adjusts the height of the output graphs, or normalizes to unit area, or
normalizes to unity at the origin, etc. These factors are usually reduced to a single multiplication at
the final stage (Bracewell, 2000, p. 273).
The equations used by Matlab® are often adopted by electrical engineers and statisticians.
The physicist feels more comfortable with another convention that solves the problem of
dimensional analysis and numerical outcome all in one. If we use physical units, it can be shown
that

$$\frac{1}{N} = \Delta t \cdot \Delta f, \tag{5}$$

and the transform pair becomes

$$x(t_n) = \sum_{k=0}^{N-1} X(f_k)\, e^{j 2\pi k n/N}\, \Delta f, \qquad 0 \le n \le N-1 \tag{6}$$

$$X(f_k) = \sum_{n=0}^{N-1} x(t_n)\, e^{-j 2\pi k n/N}\, \Delta t, \qquad 0 \le k \le N-1 \tag{7}$$

where $t_n = n\,\Delta t$ and $f_k = k\,\Delta f$. Now not only is the 1/N term contained implicitly in the equations, but the units are also
consistent. For example, if the original signal had units of volts and $\Delta t$ had units of time
(seconds), then the DFT would have units of volts-sec or volts/Hz. If equation (5) is not
incorporated in the transform pair, however, the term 1/N must appear (as in the pair (1)-(2)). Here
we will accept the analysis done by Matlab®.
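The physical-units convention can be sketched in Python by scaling the forward sum by the sample spacing Δt and the inverse sum by the frequency spacing Δf = 1/(N·Δt). The function names and the sample values are hypothetical; the point is only that the product Δt·Δf plays the role of the 1/N factor.

```python
import cmath


def physical_dft(x, dt):
    """Forward transform scaled by dt; output units are (signal units) * seconds."""
    N = len(x)
    return [
        dt * sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def physical_idft(X, dt):
    """Inverse transform scaled by df = 1 / (N * dt)."""
    N = len(X)
    df = 1.0 / (N * dt)
    return [
        df * sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N))
        for n in range(N)
    ]


if __name__ == "__main__":
    dt = 0.001  # hypothetical sample spacing in seconds (1 kHz sampling)
    x = [0.0, 1.0, 0.0, -1.0, 0.5, 2.0, -0.25, 1.0]  # hypothetical samples in volts
    X = physical_dft(x, dt)
    back = physical_idft(X, dt)
    print(max(abs(u - v) for u, v in zip(back, x)))
```

Because dt·df equals 1/N, the round trip reproduces the samples without any explicit 1/N factor in either function.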

Acoustic signals are often classified as either deterministic or random. A deterministic
signal is one that, as the word implies, has some deterministic or stable nature. Signals from
periodic processes (rotations of a machine) or transient processes (a loud impact) are usually
deterministic. Random signals are those that arise from complex sources of sound and are of no
deterministic nature whatsoever (signal from a high speed air flow or a turbulent boundary layer)
(Crocker, 2003). Most real signals, no matter if they originate from an underwater
natural source or the human voice, contain both deterministic and random components.
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Spectrogram is a term used widely to describe a two- dimensional plot of intensity vs. 
frequency and time. In this paper, since we are interested in frequencies, we will often employ 
the spectrogram. In most cases we will consider the same spectrogram plotted with different 
colors to reveal finer details. In other instances we will “zoom in” to take a closer look at the part 
that interests us more than others. 

Since we used Matlab® to compute the spectrogram some clarifications are in order. The
Matlab® command for producing a spectrogram is:

[B,F,T] = SPECGRAM(A,NFFT,Fs,WINDOW,NOVERLAP),

where A is the vector to which the algorithm is applied, NFFT is the length of the DFT, Fs is the
sampling frequency, WINDOW is any window the experimenter wishes to employ, and NOVERLAP is the
number of samples by which consecutive windows overlap. This function divides a long signal into windows and
performs a DFT on each window, storing complex amplitudes in a table (B) in which the columns
represent time and the rows represent frequency. We can choose our own axes when it comes to
plotting the results according to what kind of information we need to extract from the graph. Just
as the DFT is applied to each slice, the window of our choice is applied to each slice. This may
lead to misinterpretations, though, since windowing has some special effects on the
outcome.
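The slicing procedure described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a reimplementation of the Matlab® routine: the function name `spectrogram`, the Hann window, and the 125 Hz test tone sampled at 1000 samples/sec are all chosen only for the example.

```python
import cmath
import math

def dft(frame):
    """Plain DFT of one slice (O(N^2), adequate for a short demonstration)."""
    N = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def spectrogram(signal, nfft, fs, window, noverlap):
    """Slice the signal, window each slice, and take the DFT of each slice.

    Returns (B, freqs, times): B[k][j] is the complex amplitude at frequency
    freqs[k] for the slice starting at time times[j] (rows = frequency,
    columns = time).
    """
    hop = nfft - noverlap                     # samples between slice starts
    columns, times = [], []
    for start in range(0, len(signal) - nfft + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(nfft)]
        columns.append(dft(frame)[: nfft // 2 + 1])   # keep non-negative frequencies
        times.append(start / fs)
    freqs = [k * fs / nfft for k in range(nfft // 2 + 1)]
    B = [[col[k] for col in columns] for k in range(len(freqs))]
    return B, freqs, times

# Hypothetical example: a steady 125 Hz tone sampled at 1000 samples/sec
fs, nfft, noverlap = 1000, 64, 32
window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (nfft - 1)) for n in range(nfft)]  # Hann
signal = [math.sin(2 * math.pi * 125 * n / fs) for n in range(256)]
B, freqs, times = spectrogram(signal, nfft, fs, window, noverlap)
```

For a steady tone, the row of B with the largest magnitude should be the same in every column, which is what a horizontal line on the spectrogram expresses.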


Consider for example the rectangle function

$$\mathrm{rect}(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

to be the window function (or tapering function) for the infinite sequence x(n) that is the original
signal. Then the windowed data sequence x_w(n) is the product of the
signal and the window function,

$$x_w(n) = x(n) \cdot \mathrm{rect}(n) . \tag{9}$$

After windowing an infinite signal we make the drastic assumption that anything outside the
window is zero. The original signal changes from infinite duration to finite (truncation) and it is
now modified by the window we chose to apply, in this case the rectangle window. Often the
truncation of a signal will yield Gibbs Oscillations at the discontinuities of the boundaries or
around any rapid change of the transform of the windowed signal. Gibbs oscillations are
oscillations (decaying overshoot) about the high and low limits of the truncated function; for a
continuous-time function they amount to about 9% of the jump discontinuity, 0.09[f(x+) - f(x-)]. The
mathematics of this overshoot is now well understood and was first analyzed by Josiah W. Gibbs
(1839-1903). Here we will be concerned about the effect of Gibbs Oscillations.
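The figure of about 9% can be reproduced with a short numerical experiment. The sketch below is an illustration only, under the assumption of a unit square wave that jumps from -1 to +1 at x = 0; the number of terms and the grid spacing are arbitrary choices.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Partial Fourier sum of a square wave that jumps from -1 to +1 at x = 0."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k - 1) * x) / (2 * k - 1) for k in range(1, n_terms + 1)
    )

n_terms = 200
xs = [i * 2.5e-5 for i in range(1, 2001)]    # fine grid just to the right of the jump
peak = max(square_wave_partial_sum(x, n_terms) for x in xs)

jump = 2.0                                    # f(0+) - f(0-) = 1 - (-1)
overshoot_fraction = (peak - 1.0) / jump      # overshoot as a fraction of the jump
```

Adding more terms moves the overshoot closer to the discontinuity but does not make it smaller, which is why truncation effects cannot be removed simply by taking more data.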

The convolution theorem states that multiplication in the time domain is equivalent to
convolution in the transform domain. Therefore, if we define X_w(f), X(f) and sinc(f) to be the
transforms of the discrete functions in equation (9),

$$x_w(n) \longleftrightarrow X_w(f), \qquad x(n) \longleftrightarrow X(f), \qquad \mathrm{rect}(n) \longleftrightarrow \mathrm{sinc}(f) , \tag{10}$$

where

$$\mathrm{sinc}(f) = \frac{\sin(\pi f)}{\pi f} , \tag{11}$$

then we can write

$$X_w(f) = X(f) * \mathrm{sinc}(f) , \tag{12}$$

where the symbol [*] denotes convolution.

Instead of using the sinc function in equation (12) we can use what is known as the
digital sinc or the Dirichlet kernel D_N(f) (Marple,...), which is a scaled DFT for the
discrete-time rectangle function:

$$D_N(f) = T\, e^{-i\pi f T (N-1)}\, \frac{\sin(\pi f T N)}{\sin(\pi f T)} . \tag{13}$$


The windowed function is a somewhat distorted version of the original signal. The
sharp impulses in the magnitude of the DFT are broadened by the repeated shape of the
window transform, and the amplitude of neighboring frequency responses is influenced by the
sidelobes around a transform peak (leakage). This leakage depends, of course, on the kind of
window one uses, on how many windows are used (into how many slices the software divides
the signal), etc. The number of slices, for example, depends on the window/FFT length and the
signal's length in the time domain.
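How strongly leakage depends on the choice of window can be illustrated with a small experiment. The sketch below is a hedged example, not part of the analysis in this paper: it assumes a tone placed halfway between two DFT bins, compares the rectangle window with a Hann window (chosen here only as a common tapered alternative), and measures the level at a bin far from the peak relative to the peak itself.

```python
import cmath
import math

N = 64
TONE_BIN = 8.5   # tone frequency halfway between DFT bins 8 and 9 (assumed worst case)

signal = [math.cos(2 * math.pi * TONE_BIN * n / N) for n in range(N)]
rect = [1.0] * N
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def magnitude_spectrum(x, window):
    """Magnitude of the DFT of the windowed signal, non-negative frequencies only."""
    frame = [a * w for a, w in zip(x, window)]
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2)]

def far_leakage(spectrum, far_bin=24):
    """Level at a bin far from the tone, relative to the spectral peak."""
    return spectrum[far_bin] / max(spectrum)

leak_rect = far_leakage(magnitude_spectrum(signal, rect))
leak_hann = far_leakage(magnitude_spectrum(signal, hann))
```

The tapered window trades a broader main peak for much weaker sidelobes, so `leak_hann` comes out well below `leak_rect`.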

One way of determining whether the bias factor is negligible is the following: since
Matlab® slices the signal and applies the window function and then the DFT on each such slice,
we can apply first one window and then another to see the difference between the two. We can
plot the difference as another spectrogram, since we are subtracting one
spectrogram from another. This way we will be able to determine computationally and
graphically the Gibbs Oscillation bias factor. If the bias is large we will apply some correction
techniques; if it is not, we will consider it negligible and no further action will be taken. This will
be described below, after we present and explain our spectrograms, so that the reader can better
appreciate the outcome. Before we discuss how we interpret these graphs, a brief explanation is in
order on how we selected our data.


Scope 


As opposed to the earlier attempts of Byzantine Music Theorists (Chrisanth, 1814, 1832,
Patriarchal Committee, 1881) to assign numerical values to the intervals of Byzantine Music
scales based on mathematical manipulations of the frequency relations, here we attack the
subject differently: based on performed pieces we attempt to derive the intervals of the diatonic
scale of Byzantine Music. In this paper we will consider the intervals of only one of the scales of
Byzantine Music, namely the diatonic. To our knowledge, this is the first attempt to derive
frequency intervals from a performed Byzantine Music piece.


Methodology 


Particular care was taken in selecting the chanter who performed the music piece from
which we collected our data. This is important because chanters who learned the music from
teachers outside the Byzantine culture use intervals that are closer to the Equal Tempered scales
than to Byzantine Music scales.

The selected music piece is titled “Kyrie ekekraksa”*, which means “Lord I cried [unto
thee]”, and was written by a well-known composer, Jacob the Arch-chanter (second half of the 18th
century). The piece was written in the old notation and then translated to the new notation by the
three inventors (see p. 5). The performance took place in Athens in 1975. The supervisor of the
recording was Professor of Musicology at the University of Athens, Dr. Grigorios Stathis. The 
performer is Mr. Thrasivoulos Stanitsas (1910-1987), official Arch-chanter of the Ecumenical 
Patriarchate of Constantinople for several years. Mr. Stanitsas received his music education 
within and around Constantinople. He is considered one of the most classic Arch-chanters ever 
recorded. He is a representative of the old school of chanters (the Constantinople School), the 
only one that is officially accepted by the Patriarchate. We emphasize this, because although in 
this paper we do not compare intervals among different schools of Byzantine Music, this would 
be a significant potential research topic. 

We wanted to be able to pinpoint the frequency of tones D to A of the diatonic scale, and
also to be able to make some inferences about how far a chanter may deviate from the mean
frequency and still be correct. In other words, our samples had to be chosen so that
after the appropriate analysis we can determine two quantities: the mean frequency of each tone and
the standard deviation about that mean. This would enable us to define
experimentally a frequency with a plus/minus allowed deviation from that mean frequency that a
chanter is allowed without escaping from the boundaries of the defined tones.
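The two quantities can be written down explicitly. The sketch below only illustrates the arithmetic: the list `estimates_hz` contains hypothetical values invented for the example (they are not measurements from the recording), and it assumes one frequency estimate per snippet, which is the alternative per-occurrence approach rather than the concatenation used in this paper.

```python
import statistics

# Hypothetical per-snippet frequency estimates for one tone, in Hz.
# Illustrative numbers only; they are NOT data from the recording.
estimates_hz = [289.5, 290.2, 291.0, 288.9, 290.4, 290.0]

mean_hz = statistics.mean(estimates_hz)     # mean frequency of the tone
std_hz = statistics.stdev(estimates_hz)     # sample standard deviation (n - 1 in the denominator)

# "Mean plus/minus allowed deviation", taken here as one standard deviation
allowed_band = (mean_hz - std_hz, mean_hz + std_hz)
```

Whether one standard deviation is the right width for the allowed band is itself an assumption; a wider multiple could be used in the same way.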


* Transliteration of Greek: “Κύριε εκέκραξα”. This and other Byzantine Music pieces may be found in the website
of CYLLOGOS MOUSIKOFILON CON/POLEOS (Copyright © 2000) at http://www.cmkon.org.

Audio editing software was used to cut short segments out of the music piece.
These small snippets are the bulk (steady-state) of a given tone and are usually of short
duration, between 0.1 and 0.3 seconds. For a given tone, say D, we concatenated all these snippets
to construct a signal which can be manipulated by signal processing
means. For all tones analyzed in this paper we used 20 snippets, except for tone E (40 snippets)
and tone F (24 snippets).

It is interesting to consider the pitch associated with each tone, i.e., perceived frequency 
by our ears; in a later section we will refer to some classic psychoacoustic studies. Of course this 
is something beyond this paper’s scope, but it is interesting to see after we determine the 
frequency intervals, if it is possible for our ears to detect such fine differences and, if it is, is it 
possible for human voices to perform such slightly different frequency intervals at will. 

A performer of a music piece eventually passes through all the notes of a given music
scale. If we want to see what frequency we should assign to each tone*, we need to go into
the piece and isolate one specific tone at a time, say E, then concatenate all these E’s
together, perform an analysis of some sort, and make our inferences from that analysis.** This
has been our main strategy throughout this paper for every tone of the scale.

The piece was rerecorded to fit the sampling rate needs and other format requirements of
our research using a standard PC microphone. Since we are primarily interested in the signal’s
frequency, filtering was kept to a minimum and no software was used to reduce noise. Care was
taken when rerecording the signal to minimize the external noise effects. No further filtering was
done to the signal. “Isokratima”, i.e., another tone sung by a group of chanters simultaneously
with the main melody, was not subtracted from the signal. This should not have an effect on the
frequency of the melody (the part that we are interested in) so we left it there. The mean of the
data was subtracted to reveal fine broadband details. Generally, even though there are many
methods of making the graphs look neater, the data were not processed in any such way.

* The terms “note” and “tone” are used interchangeably in this paper.
** Another approach is to analyze each occurrence of a tone separately and then apply statistical analysis to the result.

When a music piece is recorded in a studio, various alterations occur upon digitizing and
processing the piece in order to achieve the best audio quality possible. Now is as good a time as
any to introduce another idea: pitch is not necessarily the same as frequency. Frequency is the
physical term that says how many cycles a sound wave has per second. Pitch on the other hand is
the sensation of that quantitative attribute. It is known from psychoacoustics that pitch is a
function of not only frequency, but amplitude and intensity as well (Shower, E. G., and
Biddulph, R., 1931). Thus when you manipulate the musical piece to meet some audio standards
you alter some physical aspects of the sound. Not only do studio quality standards change the
signal, but the format in which the producer saves and distributes the musical piece also has
an effect on the signal. Audio compression (which makes a song that would otherwise take the space
of five CDs fit in 1/10 of a CD) truncates some of the higher frequencies (> 10,000 Hz), for
example.

There is an extensive literature on microphones (Malcolm J. Crocker, 1998, PART XVI)
and, more generally, on computer hardware specifications dealing with analog-to-digital conversion.
Professional microphones are usually calibrated by the manufacturer and come with a
calibration curve that should be taken into account when interpreting the outcome. The sensitivity
of the microphone mainly affects the amplitude of the recorded acoustical signal and, of
course, it has some effect on the frequencies of the signal. Since in rerecording the musical piece
used for collecting data in this paper I did not use a professional microphone, we will need to
make some calibrations of our own and see how much error we have.

Given better recording equipment and original, unprocessed data, one would be less afraid of
falling into minor misinterpretations of the outcome. Nevertheless, this paper looks solely into
relative frequencies within the music piece and not into whether the performer agrees in frequency with
other performers or instruments. Here we attempt to find the ratios of the frequencies themselves
that construct the diatonic Byzantine scale. It is like taking a gramophone record that has been
recorded for the standard speed of 78 revolutions/min and playing it at 82.6378 rev/min. The
whole piece would then sound about a semitone higher (because we multiplied every frequency by
1.05946, the ratio of two adjacent frequencies that differ by an equal-tempered semitone), but it doesn’t make any
difference to the listener. As long as the whole piece is raised or lowered by a multiplicative
factor (not an additive one) it doesn’t make any difference, because the ratios among tones remain correct.
A trained ear may realize that the whole piece is a semitone higher altogether, but
this doesn’t bother the listener; within that piece all harmonies are respected. This is what we do
in this paper, in a way. We don’t compare the piece to another recording. We seek the ratios
within this recording.
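The gramophone argument amounts to a one-line calculation. The sketch below is illustrative only; the two tone frequencies in `tones` are hypothetical values chosen to give a simple ratio, not results of this paper.

```python
import math

semitone = 2 ** (1 / 12)     # equal-tempered semitone ratio, about 1.05946
rpm_fast = 78 * semitone     # turntable speed that raises everything by one semitone

# Hypothetical tone frequencies in Hz (illustrative values only)
tones = {"D": 290.0, "A": 435.0}

ratio_original = tones["A"] / tones["D"]
# Multiplicative change (playing the record faster): the ratio is preserved
ratio_scaled = (tones["A"] * semitone) / (tones["D"] * semitone)
# Additive change (adding 10 Hz to every tone): the ratio is NOT preserved
ratio_shifted = (tones["A"] + 10.0) / (tones["D"] + 10.0)
```

This is why an error that scales all frequencies by a common factor would leave the derived intervals untouched, whereas an additive frequency offset would not.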

Even though we do not wish to compare this music piece to another, we do need to rely
on the graphical outcome in the sense that when the graph says that a tone has a frequency of,
say, 440 Hz it really is 440 Hz. Tuning forks can be employed to find out how accurate our
recording is. Because the frequency of the tuning fork is known, we can estimate the error of the
microphone, the hardware and even the Matlab® program. Tuning forks are ideal for this kind of
experimentation, in that they produce no overtones* (provided they are not struck too hard).

Another, more straightforward way of estimating the error in frequency is to have Matlab®
generate a pure tone of some frequency, record it and feed the data into our program to see how the
graphs come out. We can run both tuning forks and generated tones and see their differences.
One thing to keep in mind is that the pure tone, whether generated electronically or by a tuning fork,
was recorded in exactly the same way: mono recording, 16 bits, and with a sampling frequency of
11025 samples/sec.

* In this paper we will refer to the normal mode with the lowest eigenfrequency (n=0) as the fundamental mode, i.e.,
the first harmonic. The second harmonic (n=1) will be referred to as the first overtone, etc. (Introduction to
acoustics, ..., p.53).
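The idea of checking the analysis chain with a synthetic tone can be sketched as follows. This is a simplified Python illustration, not the Matlab® procedure used here: it skips the loudspeaker and microphone entirely, and the analysis length N = 1024 and the amplitude scaling are assumptions made to keep the plain DFT fast.

```python
import cmath
import math

FS = 11025     # sampling frequency in samples/sec, as in this paper
F0 = 440.0     # frequency of the generated pure tone in Hz
N = 1024       # analysis length (assumed; short so the O(N^2) DFT stays fast)

# 16-bit mono samples of a pure tone at 80% of full scale
samples = [int(round(32767 * 0.8 * math.sin(2 * math.pi * F0 * n / FS)))
           for n in range(N)]

def dft_magnitude(x, k):
    """Magnitude of the k-th DFT coefficient of x."""
    n_pts = len(x)
    return abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                   for n in range(n_pts)))

mags = [dft_magnitude(samples, k) for k in range(N // 2)]
peak_bin = max(range(N // 2), key=mags.__getitem__)

resolution_hz = FS / N                 # spacing between adjacent DFT points
peak_hz = peak_bin * resolution_hz     # frequency read off the highest peak
```

The frequency read from the peak can differ from the true 440 Hz by up to half the spacing between DFT points, which sets a floor on how precisely a frequency can be read from such a graph.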

The next subsection presents some of these graphs from generated pure tones and tuning
forks. These graphs have been used as a rough calibration method of our hardware, software,
apparatus, and programs. Because the same kinds of graphs are used to analyze the voice itself,
we will spend some time explaining what each graph presents.


Error Measurement


[Figures 1-a, 1-b, 1-c and 1-d: Generated Pure Tone (440 Hz)]


This first figure shows the result of a Matlab®-generated pure tone of 440 Hz. After the
tone was created it was recorded with a microphone. Then we had Matlab® read it into a
data array (94611×1) and we performed the analysis using standard Matlab® commands as well
as Signal Processing Toolbox commands. The sequence of the graphs has been arranged so that
some of them are repeated magnified, so that subtle points can be seen and easily compared with the
rest of the graphs of the same group.

Fig. 1-a shows the amplitude vs. time. As we said before, amplitude is of no particular
importance when talking about frequencies, but we show it here for reference. The lower panel
of fig. 1-a shows the frequency over time, what is sometimes referred to as the spectrogram. This
plot is of particular importance for us, because it shows how stable a tone is, i.e., how much its
frequency fluctuates over time. Notice how smooth the pure generated tone is, as
expected. The colors indicate amplitude, with the “cooler” colors being the lower amplitudes and
the “hotter” colors indicating higher amplitudes. On the side there is a scale in arbitrary
units showing the relative high and low amplitudes. We see that, for example, our generated tone
did not fluctuate in amplitude, except maybe at the edges, and that was because I intentionally
moved the microphone toward and away from the speaker at the beginning and end of the data
collection time. As expected, the intensity falls off with distance (inverse square law).

* The program files (*.m files) used in this paper were mostly prepared and supervised by Dr. Juliette W. Ioup.

Fig. 1-b shows another spectrogram with different colors. Not only are there no
overtones, as expected when dealing with a pure tone, but we can now see some finer details on
the spectrogram. We see for example some faint lines close to zero. Again, colors here imply
amplitudes, so these fine lines below the fundamental are not likely to be hypo-fundamental
overtones, but are rather the result of some noise picked up by the microphone, or of some
resonance of the speaker, etc.

The lower panel of fig. 1-b shows the FFT (the Fast Fourier Transform, an efficient algorithm for
computing the DFT) vs. the frequencies in our signal. Now that we are in the frequency domain we can see where the
frequencies occur. We see two abrupt peaks at what appears to be around 440 Hz and some
other ones closer to the origin. These smaller peaks closer to the origin are most likely the noise
that showed up in the panel above.

Fig. 1-c is the same as fig. 1-b, only magnified. Here we zoomed in to see finer details,
even though it is not necessary here because our pure tone is so stable; this zoomed-in graph will
come in handy when we need to examine other signals that are not so simple in nature. Both graphs now
show the frequency to be at about 440 Hz. This is clearer than in the previous graph, which covered
the whole frequency range.

Fig. 1-d shows amplitude vs. frequency. The upper panel covers the whole range of
frequencies and the lower panel is again zoomed in. From this lower panel we can clearly see
that the frequency of our tone is indeed 440 Hz. The DC value at the origin (upper panel) is again
most likely due to noise, like the low-frequency peaks in the previous figures.

The lower panel of fig. 1-d looks as if it does not have many points along its curves;
the curve looks rather angular. This is a result of the window length. For example, I chose a
window length of N = 4096 points and a sampling frequency (Fs) of 11025 samples/sec. By
equation (5), then, Δt ≈ 9.0703×10⁻⁵ seconds and Δf ≈ 2.691 Hz. This means that my resolution
(Δf) is the distance on the graph between two adjacent points. The best each point can resolve is
Δf. If I wanted better resolution (more points within the same distance) I should choose a
longer length for my window and FFT. Say I choose N = 32,768 points (an integer power of two, 2^15). Then
Δf ≈ 0.336 Hz, a much better resolution indeed. There are some drawbacks, however. Not only
is the time span of each window now longer (2.972 seconds as opposed to 0.371 seconds, which is negligible in our case),
but certain implications like higher leakage and differences in power may be observed. Not only
mathematical differences, but differences in the graphs may be seen also. For example, with such a
long N, the program is still slicing the signal and applying the window, but each
window is so extended in time that it is superimposed on the neighboring windows, and
the slices all look like one straight stable line, which is not the case. It is a leakage-resolution tradeoff
that the experimenter has to consider seriously. For this paper we kept the window size at 4096,
and this will be the case everywhere, unless otherwise indicated. For comparison the last part of
fig. 1-d is repeated with N = 32,768 points this time. Notice that as a consequence the amplitude
has grown by a factor of about ten and the peak is narrower (fig. 1-e). Taking more data does
give finer resolution of the transform in terms of the width of peaks, whereas padding with
zeroes gives finer resolution to the graph of the transform without changing it.
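The numbers quoted above follow directly from equation (5). The short sketch below simply redoes the arithmetic in Python; the function name `resolution` is an assumption made for the example.

```python
FS = 11025  # sampling frequency in samples/sec

def resolution(n_points, fs=FS):
    """Return (time step, frequency step, window duration) for an N-point window."""
    dt = 1.0 / fs              # seconds between samples
    df = fs / n_points         # Hz between adjacent DFT points, so dt * df = 1/N
    duration = n_points * dt   # seconds spanned by one window
    return dt, df, duration

dt, df_4096, dur_4096 = resolution(4096)       # the window length used in this paper
_, df_32768, dur_32768 = resolution(32768)     # the longer window, 2**15 points
```

Lengthening the window by a factor of eight narrows the frequency step by the same factor and lengthens the time span of each slice by the same factor.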

The mainstream method for dealing with undersampling or resolution problems is
interpolation, if simply increasing the number of points is not desired for some reason. There are
numerous software programs that perform midpoint interpolation, polynomial interpolation, etc. Here we
decided that, where applicable, we will adjust the window length appropriately to gain as much resolution
as possible while sacrificing as little as possible in return.

[Figure 1-e]
Fig. 2 shows the same analysis that was done for the electronically generated pure tone,
applied to a tone generated by a tuning fork. Notice the falling amplitude in fig. 2-a. Also notice
in the subsequent graphs how the frequency is constant, independent of the amplitude. The
last graph shows that the frequency of my tuning fork, however, is not exactly 440 Hz, but a bit
lower, around 437.5 Hz. This is still fine because there are other factors involved in an actual
tuning fork experiment, like temperature, tine separation, etc., that would affect the outcome.
Acoustically a difference of about 2 Hz doesn’t make any difference for the human ear. One can actually
hear long low-frequency beats when the tuning fork and the speaker sound their tones about 2 Hz
apart, an indication of a small difference in frequency.

We have also conducted similar simple experiments using three other tuning
forks. Two of them were shown by the graph (a graph similar to 1-e) to be exactly at the
frequency engraved on the tuning fork, and the analysis of the third showed a frequency
about 1 Hz higher than the frequency indicated on the tuning fork*.

At first glance we can see that the problems we may encounter due to equipment do
not prevent us from conducting this research. We do have noise, and there are ways to eliminate
most of these discrepancies, but this is beyond the scope of this paper. For our purposes we are
confident that our results, as far as the frequency ratios go, will be trustworthy.


* These three tuning forks were provided by the Lab Coordinator N. B. Day.


CHAPTER 3


RESULTS


In this chapter we present our analyzed data. We start with the first tone of the
diatonic scale, D (see figure 1), and progressively go up the scale to G, a fourth (tetrachord)
above.

As seen in figure 1 above, the first fourth (from D to G) is repeated from A to the
high D. Remember that this is the diatonic scale based on Pythagoras’ scale of fifths and fourths.
The two fourths are thought to be identical, so if we find the frequency ratios of the low fourth
we can generalize this result to the high fourth. Notice the major tonal interval G-A that
separates the two fourths; it is called the disjunctive tone of the diatonic scale and we will find its
ratio later on. Comments on the various graphs are made where necessary for easier
interpretation.

The human voice does not behave like a piano string, but more like a violin string, which
first of all can play more than tones and semitones, and second can produce a voluntary vibrato
effect on a tone. Even though the different tones that were extracted from the piece are of short
duration (ranging from 1/10 of a second up to a second), this vibrato effect is still there, as
expected. We will tolerate a frequency uncertainty due to this vibrato effect.

Another effect when dealing with human voices (especially in performing without
instruments) is that tones are not expected to be exactly of the same frequency as the chanter
goes up and down the scale. We can call this human error, although I don’t like the term
personally (even these small variations in frequency have their own musical meaning). I will call
it the attraction effect, i.e., the pull that the main tones of the mode exert on the secondary tones.
This is another point this paper will briefly comment on.

In Byzantine Music there are eight modes in total. Mode is another word for a set of rules 
applied to a piece to be written or performed. One of these rules is that a mode must have some 
main tones and some secondary tones. The main tones are consonant intervals with respect to the 
basis tone and the piece gravitates around them constantly. The secondary tones are tones in 
between main tones and are performed “on your way” to the main tone. So we expect the main 
tones to be more constant in frequency than the secondary tones. On the other hand, a given 
secondary tone would be slightly higher if it is performed on the way to a higher tone — if it is 
between a lower and a higher — and accordingly will be slightly lower in frequency if it is placed 
in between a higher and a lower tone. For the purpose of this paper D, F, G and A will be 
considered main tones and E will be the secondary tone. 

The higher tetrachord (tones B, C and high D) will not be analyzed, for two
reasons. First, the piece we chose does not contain the high D tone and contains very few B and C
tones. We did choose this piece, however, because it contains a plethora of samples for the rest
of the tones, especially from tone D to G. It is important to collect all samples from the same
piece. Second, it is universally accepted that the frequency ratio of an octave is 2:1. It would be
nice to check if this is the case, but it is more important to have a sufficient number of sample
tones. As we said earlier, the two fourths are thought to be identical, thus we will analyze only
the lower tetrachord up to tone G and tone A (disjunctive tone). Then we will assume the higher
tetrachord to possess the same frequency ratios for discussion purposes.

The rest of this chapter will be divided into five sections, each dealing with a tone from D
to A. The first section, 3-1, discusses tone D.
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SECTION 3-1 


TONE D 


Tone D is the tone where the “isokratima” is held by the group of chanters accompanying
the solo chanter. Therefore we expect it to be stable, i.e., without fluctuations about the mean
frequency, not only because it is a main tone, but also because of the accompanying chanters.

Tone D is also the basis of the mode that the piece is written in, and therefore the musical
piece starts and ends with this tone. For example, we have what is called the pro-echos*, i.e., a
constant tone (the basis of the mode, D) sung by the chanter at the beginning. This pro-echos has
a considerable duration of approximately 3.6 seconds, and as a signal it is also continuous and
constant in the time domain. At the end of the piece we also have a long tone D known as the
final termination**. The first analysis on tone D will be to compare the first and last D tones
(figures 3 and 4, respectively) and see if they agree in frequency. Then we will consider the
concatenated D tones (figure 5) taken from the musical piece between the pro-echos and the final
termination.

* Term Translation: Απήχημα ή Προήχημα.
** Term Translation: Τελική Κατάληξις.

Notice the stable frequency in figures 3-a and 3-c, even though the amplitude is decreasing.
Figure 3-d shows the pro-echos to be at 283 Hz. The accepted value for tone D is 293.6 Hz
(Jeans, 1968, p.22). Usually chanters, especially at recordings, use a tuning fork that gives the
tone A at 440 Hz, and the tone/basis is then found from the tuning fork. Consequently, a
frequency difference of the order of 10 Hz is substantial. Let us examine the final termination
tone D and see if it agrees with the pro-echos tone D.
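To put a difference of this size on a musical scale, it can be expressed in cents (hundredths of an equal-tempered semitone). The cent is not used elsewhere in this paper; the sketch below is offered only as a convenient way of reading the two values quoted above, and the function name `cents` is an assumption.

```python
import math

def cents(f_measured, f_reference):
    """Interval between two frequencies in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f_measured / f_reference)

pro_echos_hz = 283.0      # value read from figure 3-d, as quoted in the text
reference_d_hz = 293.6    # accepted value for tone D quoted in the text

offset_cents = cents(pro_echos_hz, reference_d_hz)
```

The result is roughly -64 cents, that is, the pro-echos lies more than half an equal-tempered semitone below the accepted value, which supports calling the difference substantial.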
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* All figures are numbered from a to d and even if we do not comment on some of them, we include them for 
comparison and reference. 
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The final tone D (figure 4-d) seems to be at 290 Hz, a value much closer to the expected 
one (293.6 Hz). Now we are faced with the dilemma of which of the two frequencies to accept as
the frequency of the tone D. An interesting question, of course, is why the chanter starts with a
frequency slightly lower than the one he finishes with. It is beyond the scope of this paper to try
to answer questions like that, but an interesting experiment would be to find the correlation of
tone memory with the ability to learn music. A possible explanation is that the chanter used his
tuning fork to find tone D at around 290 Hz and then, because he was left alone for some time to
prepare for the recording, he adjusted that frequency to his own vocal needs, or maybe he just
voluntarily “forgot” the frequency. In our case, however, we do have a last resort, the
concatenated D tones; their results are shown in figure 5.
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Figure 5-a shows the amplitudes of the concatenated notes. Notice that the amplitudes 
were chosen not to be zero at any time. Also notice the change in amplitude, a consequence of
working with a real signal: the singer doesn’t always sing with the same amplitude, but sings
louder in some instances than in others. Figure 5-b shows the spectrogram. In this graph we can see
the fundamental (around 290 Hz) and two higher overtones. Surprisingly, some of the overtones
seem to have higher amplitude. This is a consequence of what is known as the formant of the
human voice. The air from the lungs passes through the vocal cords, causing them to vibrate at
some frequency that depends mainly on the mass of and the tension on the cords. This sound then
enters the vocal and nasal cavities before it reaches the mouth (another cavity), which
continuously changes shape to produce different vowels and consonants. The final outcome, the
voice, is the product of superimposed acoustical waves that resonate in these cavities. As a
result, some frequencies are amplified and others are made softer.

Figure 5-d, upper panel, shows what a formant would look like. It is basically the
amplitude vs. frequency plot of a vowel of some frequency (usually a constant frequency). This
envelope of amplitudes is what gives the timbre of a person’s voice. It is the basis of voice
recognition software and devices. It is also the reason why we can recognize a familiar voice
without seeing the face, e.g., when we talk with somebody on the telephone.

We will often encounter overtones that possess higher amplitudes in this paper. As we
said earlier, frequency is independent of amplitude. On the other hand, amplitude may be higher at
some point just because the singer is singing louder at that moment, or because of the formants.
We reserve the discussion of pitch and formants for a later section. For the time being we will
focus on the frequency of our signal.

From figure 5-b we see that we have overtones as high as about 3000 Hz (lower panel). We do
not know whether there were more overtones in the analogue signal that were truncated by studio
processing and formatting. If there were, the richness of the voice has been altered. The upper
panel shows overtones up to 1000 Hz.

The fact that we deduce the frequency of a tone by reading it off the graph, instead of using
some algorithm to give us a more accurate numerical value, may seem unprofessional to some
extent. We reserve a more sophisticated method of analysis for future publication. For the time
being, we think that the frequency resolution used in this experiment is sufficient, given the
inability of the ear to resolve tones that differ by a small frequency ratio. This subject will be
discussed in a later section.

Based on figure 5-d we conclude that the frequency of tone D is 290 Hz. The similarity of
the final termination D tone and the 20 concatenated D tones should not be surprising. After all,
the frequency difference between the pro-echos and the other two tones (~ 7 Hz) is not that big
and it can be attributed to other factors. I reserve this topic for future research.


SECTION 3-2


TONE E 


Next we proceed with note E of the diatonic scale. Tone E is our only secondary tone,
thus we will consider two cases of E: one in which the tone E is pulled up and another in which
the tone E is pulled down. This attraction effect occurs according to the functional relation
between the tones within a mode. Even though we will not discuss the theory of the attraction
effect here, we will say that usually an upward pull occurs when a secondary tone is in between a
lower and a higher main tone, and a downward pull occurs when a secondary tone is in between a
higher and a lower tone. Here all the tone wave files (*.wav) labeled “up” or “dn”, for upward or
downward pull respectively (see Appendix A), were selected carefully to represent the attraction
effect.

Figure 6 shows the graphs of all E tones. Figure 7 shows the plots of only those tones
from figure 6 that are pulled upward and figure 8 shows the plots of the rest of the tones of
figure 6, which are pulled downward.

Figure 6 shows 40 different E tones collected throughout the piece. The first 20 E tones
are pulled downwards by note D and the last 20 are pulled upwards by note F. Notice the
fluctuation about the frequency in figure 6-a (lower panel). The probable explanation for this
fluctuation (not seen in tone D above, or in the tuning forks) is that these tones, even though
they are all E, are performed at slightly different frequencies (attraction effect).

The accepted value for E (Jeans, 1968, p. 22) is 329.6 Hz. The fundamental must then be
the frequency closest to 329.6 Hz, thus we consider the second line in figure 6-b (upper panel) to
be the fundamental. A blown-up spectrogram is shown in figure 6-c. It is easy to see that the first
half of the spectrogram is slightly higher than the second one, because the first twenty tones are
pulled up in frequency.

Figure 6-d is interesting. It shows the frequencies of the tone as if they were skewed a
little to the right. It is not an upright curve like the one we have seen for D. The peak is at
320 Hz, then we see another local maximum at 310 Hz, and the bulk of the peak falls rather toward
the lower frequencies. Let us regard the frequency of this tone E as 313 Hz, since the peak is
centered on this frequency. Next we will see all the E tones that are pulled up in frequency by F
(figure 7).

Figure 7 below shows the E tones of the diatonic scale that are subject to the attraction
effect, pulled upwards. Notice how much more stable the spectrogram of the fundamental is in
figure 7-c; it almost never touches 300 Hz (horizontal grid line). Also notice how stable all the
harmonics are in figure 7-b, compared to 6-b above. The FFT length is still the same, and the
axes of the graph are the same also. The stability of the frequencies of all the pulled up E
tones stems from the fact that now we are comparing similar tones, as opposed to figure 6, which
considered both kinds of E tones.

In other words, we are faced with two kinds of the same tone E. Some pianos and organs of the
Middle Ages had additional keys to play these intermediate notes, but they were abandoned due to
the difficulty of performing on such “over-keyed” instruments. This is one of the points this
paper wants to make: the voice performs without instrumental restrictions, yet no dissonance is
present. There can be consonant intervals other than the equal tempered intervals. Byzantine
Music takes advantage of its non-instrumental nature, and it is important to measure and report
these intervals experimentally.

Figure 7-d again shows the formant (upper panel) of tone E with different vowels, hence
the difference in amplitude over the harmonics. The lower panel of figure 7-d shows an upright,
clear frequency of about 320 Hz for the fundamental.

Next we consider all the E tones that are pulled down by D (figure 8).

[Figure 8: Tone E (Attraction Effect - Pull Down - 20 Samples)]

Figure 8 is the result of all the E tones that are pulled down by D. It was the most tedious
tone to collect, because the singer usually voices this particular tone with a hyphenated leaning
towards D. The total time length of this signal is approximately 3 seconds, the shortest of all
signals in this paper. The number of tones is 20, the same as in the E tones pulled upward.

In figures 8-a and 8-c we see how stable the signal is about the mean if we compare it to
the one in figure 6. Figure 8-c shows the signal slightly lower (~ 300 Hz) than the one in
figure 7.

The graph of interest is the one in figure 8-d, where we see again the frequency of the
signal (lower panel). It seems not to be far from 305 Hz, some 15 Hz lower than the same E tone
when pulled upwards.

Table 2 summarizes section 3-2. It shows the frequencies of tone E pulled up by F or
down by D and it also shows the frequency of the tone when both up and down pulled E tones
are considered.


                        Frequency
                        (Hz)
Pulled up               320
Pulled down             305
Pulled up and down      313


Table 2. Tone E


SECTION 3-3


TONE F 


Next we supply the data analysis for tone F. Tone F is a main tone, therefore we do not
expect it to be subject to the attraction effect. Nevertheless, we collected F tones that were in
between E and G and tones that were in between G and E. In other words, we followed the same
procedure used for collecting the attracted E tones (the secondary tone) and we subjected them to
the same analysis to see if the attraction effect takes place in the case of the F tone.

The results are shown in figures 8, 9 and 10 below. Figure 8 shows all 24 samples
containing the possibly pulled up and the possibly pulled down F tones. Figure 9 shows only the
12 samples of possibly pulled up F tones and figure 10 shows only the 12 samples of possibly
pulled down F tones.

The harmonics in figures 8-a and 8-c seem to be stable, compared to figure 6. Since the
frequency is not fluctuating greatly about the mean, at first glance it seems that the
attraction effect did not take place in the case of tone F, as expected, because F is a main tone.

Figure 8-d shows the frequency of tone F around 340 Hz. Compared to the accepted
value of 349.2 Hz (Jeans, 1968, p. 22) this is not too far off. In order to be certain, however,
that we do not have the attraction effect, we need to compare this value to the ones we get when
only the possibly pulled tones are treated. Figures 9 and 10 below show these results.

[Figure 10-c]

Since we see no significant difference in all the three cases shown in this section, we
conclude that tone F is not subject to the attraction effect and its frequency is around 340 Hz.
We will generalize this result to the rest of the main tones considered in this paper.


SECTION 3-4


TONE G 


Tone G is a main tone, therefore the attraction effect is not expected to occur. The
accepted value for tone G is 392 Hz (Jeans, 1968, p. 22). The results of our analysis are shown
in figure 11 below.

The frequencies are stable as seen in figures 11-b and 11-c. Again figure 11-c shows the
frequency of tone G. It seems to be at 392 Hz, but because the curve is slightly skewed with a
longer tail to the left we will take the value of G to be 390 Hz, which is very close to the
European tone G with frequency 392 Hz.


SECTION 3-5


TONE A


Tone A has an accepted value of 440 Hz (Jeans, 1968, p. 22) and it is the basis for
deriving the frequencies of the rest of the well-tempered scale. It is a main tone and most
probably the tuning fork used in this performance was tuned to 440 Hz.

As I said earlier, the music piece selected for this research has many lower tones (first
tetrachord), but as one moves up the scale tones become scarcer. Tone A is no exception; for
tone A we used 13 samples (snippets). The total time duration of the signal is about 2.6 seconds.

Figure 12 below shows the results. Of particular interest is figure 12-d, which attributes
to tone A a frequency of approximately 440 Hz. This is pleasantly surprising if we consider that
even an analysis of a tuning fork at 440 Hz (figure 2-d) did not yield such clean results. The
tone A considered here is similar to figures 1-d and 1-e, which show the analysis of the
electronically generated tone.

Consequently, we will take tone A to have a frequency of 440 Hz.


CHAPTER 4


DISCUSSION 


Frequency Ratios 


In this part we deduce the frequency ratios based on our experimental data. Table 3
shows all the results along with the frequency ratios.


TONE    FREQUENCY    FREQUENCY
        (Hz)         RATIO
D       290          1
Eexp    313          1.0793
Edn     305          1.0517
Eup     320          1.1034
F       340          1.0625 (from Eup)
G       390          1.1470
A       440          1.1282


Table 3. Frequencies and Frequency Ratios.


The ratios in table 3 are of adjacent tones, i.e., from a given frequency of D one need only
add 7.93 % to find E experimental. The three E entries are all referred to D, and F is referred
to Eup. The experimental E tone (Eexp) is the one that was derived using both pulled up (Eup)
and pulled down (Edn) E tones.
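
As a check on the arithmetic, the ratios of table 3 can be recomputed from the listed
frequencies. The short Python sketch below is only an illustration under that assumption; the
dictionary and function names are ours and are not part of the original analysis.

```python
# Measured tone frequencies in Hz, as listed in table 3.
freqs = {"D": 290.0, "Eexp": 313.0, "Edn": 305.0, "Eup": 320.0,
         "F": 340.0, "G": 390.0, "A": 440.0}


def ratio(upper, lower):
    """Frequency ratio of two tones (upper tone over lower tone)."""
    return freqs[upper] / freqs[lower]


# (tone, reference tone): the three E variants refer to D and F refers to
# the pulled-up E, matching the "(from Eup)" note in table 3.
pairs = [("Eexp", "D"), ("Edn", "D"), ("Eup", "D"),
         ("F", "Eup"), ("G", "F"), ("A", "G")]

ratios = {f"{upper}/{lower}": ratio(upper, lower) for upper, lower in pairs}
for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
```

The printed values agree with table 3 to within rounding in the fourth decimal place.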

When one performs a scale it is customary to sing from the first to the last tone of that
scale and then descend from the last to the first. For the diatonic scale, for example, when the
chanter sings the ascending part of the scale he performs tones E and B pulled up; accordingly,
when he sings the descending part he performs the same tones pulled down. It is peculiar that no
known theoretical textbook offers two different diatonic scales, one ascending and another
descending, even though it is well known that the attraction effect is alive and well established
in the tradition of the music as well as in its everyday use.

Now we attempt to compare the frequency ratios obtained from the scales in figures 1-a
and 1-b. To obtain the ratios we make use of the following formula:


$$C \cdot \log_2\!\left(\frac{f_f}{f_i}\right) = \text{atoms} \quad\Longrightarrow\quad \frac{f_f}{f_i} = 2^{\,\text{atoms}/C} \tag{14}$$

where f_f is the final frequency and f_i is the initial frequency, and the two tones are adjacent.
The base 2 reflects the ratio of the octave (2:1) and C is a constant that denotes the total
number of atoms in the octave: 68 atoms in Chrisanth’s scale (figure 1-a), or 72 atoms in the case
of the Patriarchal Committee’s scale (figure 1-b). The word “atoms” in equation (14) denotes the
number of atoms in the boxes of figure 1. Using 68 for figure 1-a and 72 for figure 1-b and the
appropriate values for the Major, Minor and Least tonal intervals we obtain the following
frequency ratios for the two scales (Table 4):


             Chrisanth    Patriarchal
                          Committee
Major        1.1301       1.1225
Minor        1.0960       1.1010
Least        1.0740       1.0800


Table 4. Frequency Ratios for the two scales in figure 1.
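
The conversion from atoms to frequency ratios can be sketched in a few lines of Python. The
atom counts per interval used here (12, 9 and 7 of 68 for Chrisanth; 12, 10 and 8 of 72 for the
Patriarchal Committee) are our assumption about the boxes of figure 1; they are consistent with
the totals quoted above and with the ratios of table 4. The function name is hypothetical.

```python
def interval_ratio(atoms, total_atoms):
    """Frequency ratio of an interval of `atoms` atoms when the octave (2:1)
    is divided into `total_atoms` equal atoms, following equation (14)."""
    return 2.0 ** (atoms / total_atoms)


# ASSUMED atom counts per interval (figure 1 is not reproduced here).
scales = {
    "Chrisanth": {"total": 68, "Major": 12, "Minor": 9, "Least": 7},
    "Patriarchal Committee": {"total": 72, "Major": 12, "Minor": 10, "Least": 8},
}

table4 = {
    name: {kind: interval_ratio(spec[kind], spec["total"])
           for kind in ("Major", "Minor", "Least")}
    for name, spec in scales.items()
}

for name, row in table4.items():
    print(name, {kind: round(value, 4) for kind, value in row.items()})
```

Under these assumptions the computed ratios match table 4 to within one unit in the fourth
decimal place.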


Using the ratios we can go back and calculate the frequencies of the diatonic scale for the
two scales. Then we can compare directly with the frequencies that we found experimentally and
see how close theory is to practice. We start from tone A at 440 Hz and we go down to low D
and up to high D. The experimental values of tones B, C and high D in table 5 have been
constructed using the frequency ratios of the first tetrachord; these last three tones have not
been processed experimentally due to their paucity of occurrence. The results are laid out in
Table 5. The experimental entries for B, C and high D are therefore frequencies not found
experimentally, but calculated from the ratios of the first tetrachord.


TONE    Chrisanth    Patriarchal    Experimental
                     Committee      Results
D       292.68       293.67         290
E       320.78       323.33         320 (Eup)
F       344.52       349.20         340
G       389.35       391.98         390
A       440          440            440
B       482.24       484.44         485.50
C       517.92       523.19         515.83
D       585.31       587.28         591.66


Table 5. Frequencies in Hz for the two scales in figure 1 along with our
results. Only tones D to A have been experimentally analyzed.
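
The two theoretical columns of table 5 can be rebuilt by stepping down and up from A = 440 Hz.
The sketch below assumes the interval order Minor, Least, Major, Major, Minor, Least, Major from
low D to high D and the atom counts 12/9/7 (of 68) and 12/10/8 (of 72); both are our reading of
the tabulated values, not something stated explicitly in the text. All names are hypothetical.

```python
TONES = ["D", "E", "F", "G", "A", "B", "C", "D'"]
# ASSUMED order of intervals between consecutive tones, low D up to high D'.
STEPS = ["Minor", "Least", "Major", "Major", "Minor", "Least", "Major"]


def build_scale(total_atoms, atoms_per_step, a_hz=440.0):
    """Frequencies of the diatonic scale, anchored at tone A."""
    ratios = [2.0 ** (atoms_per_step[s] / total_atoms) for s in STEPS]
    a_index = TONES.index("A")
    freqs = [0.0] * len(TONES)
    freqs[a_index] = a_hz
    # Descend from A to low D: divide by the ratio of each step below.
    for i in range(a_index - 1, -1, -1):
        freqs[i] = freqs[i + 1] / ratios[i]
    # Ascend from A to high D': multiply by the ratio of each step above.
    for i in range(a_index + 1, len(TONES)):
        freqs[i] = freqs[i - 1] * ratios[i - 1]
    return dict(zip(TONES, freqs))


chrisanth = build_scale(68, {"Major": 12, "Minor": 9, "Least": 7})
patriarchal = build_scale(72, {"Major": 12, "Minor": 10, "Least": 8})

for tone in TONES:
    print(f"{tone:3s} {chrisanth[tone]:8.2f} {patriarchal[tone]:8.2f}")
```

Because this sketch uses unrounded ratios, its output can differ from table 5 by a few
hundredths of a Hz, which appears to be the effect of the four-decimal ratios of table 4.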


Notice that both scales use frequency values for E close to Eup, which suggests that the
inventors of these scales considered only the ascending part of the scale and not the descending.
Since it is traditional to give only one scale for both ascending and descending diatonic scales,
in this paper we consider only Eup as the correct value of tone E. Theoretical textbooks present
only one scale, like the one shown in figure 1, as the diatonic scale. I have never come across a
theoretical textbook that shows an ascending diatonic scale (with tones E and B pulled up) and a
descending diatonic scale next to it (with tones E and B pulled down). It is left to the guidance
of the instructor and the good ear of the student.

The frequencies of the two scales are closely related and are close to the experimental
results, except in the case of tone F. The largest difference in frequency is about 9.20 Hz,
between the Patriarchal Committee’s tone F and the experimentally found tone F. Tone F was found
substantially lower than in both theoretical scales. For the time being, there seems to be no
plausible explanation I can give to justify this difference. It could be due to physiological or
psychophysical reasons that take place around that specific frequency per se, or it could be a
misjudgment of both theoretical scales. We reserve this topic for future research.

Chrisanth’s scale seems to be a bit closer to the experimental scale, but again, the 
frequency difference is so minute it doesn’t make much difference to talk about one being closer 
to the experimental scale than the other. 

We can also express the experimental frequencies of Table 5 in terms of atoms for
readers that are more comfortable with atoms than frequencies. Since we already have the atoms
for the two scales in figure 1, we need only find the atoms for the experimentally derived
scale. Measured against Chrisanth’s scale, which has a total of 68 atoms (figure 1-a), the
experimental minor tone has 9.6573 atoms (D-E interval), the least has 5.9475 (E-F interval) and
the major has 11.8340 (G-A interval). Measured against the Patriarchal Committee’s scale, which
has a total of 72 atoms (figure 1-b), the minor tone has 10.2254 atoms (D-E interval), the least
has 6.2973 (E-F interval) and the major has 12.5301 (G-A interval). These results, compared with
the atoms shown in figure 1, also suggest that the differences are minute within the context of
pitch discrimination (see p. 79).
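
The atom values quoted above follow from solving equation (14) for the number of atoms. A
minimal Python sketch, assuming the experimental frequencies 290, 320 (Eup), 340, 390 and 440 Hz
and using a function name of our own choosing:

```python
import math


def atoms_between(f_final, f_initial, total_atoms):
    """Number of atoms spanned by the interval f_initial -> f_final,
    i.e. equation (14) solved for the atoms."""
    return total_atoms * math.log2(f_final / f_initial)


# Experimental intervals of the first five tones (E taken as Eup).
intervals = {"minor (D-E)": (320.0, 290.0),
             "least (E-F)": (340.0, 320.0),
             "major (G-A)": (440.0, 390.0)}

experimental_atoms = {
    total: {name: atoms_between(upper, lower, total)
            for name, (upper, lower) in intervals.items()}
    for total in (68, 72)
}

for total, row in experimental_atoms.items():
    print(total, {name: round(value, 4) for name, value in row.items()})
```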

Now we turn our attention to another issue: how much a performer is allowed to deviate
from these theoretical frequencies. So far we have shown that Mr. Stanitsas is very accurate in
performing the theoretically proposed intervals. This argument, of course, is better put in
context the other way around. Since chanting relies mostly on tradition and since Mr. Stanitsas
is one of the chanters most representative of traditional singing, the theorists did well in
assigning the correct number of atoms to each interval. But the theoretical scale doesn’t tell us
how much one can deviate from a given tone and still be performing the diatonic scale correctly.
For this reason we found the mean, standard deviation and variance of each tone experimentally
using the individual occurrences of each.

Here we make a distinction between what we have been presenting so far (plots) and
what we are about to present as the mean of a given tone. So far we have pinpointed the
frequency of each tone based on the graph of amplitude vs. frequency of the transform of the
concatenated occurrences. Different samples or “snippets” have different time lengths and
therefore different numbers of points (N). When we did the FFT (and the spectrogram) we used a
window on the concatenated data and an FFT length of 4096 points. The window length and the FFT
length were the same, and the window did not center perfectly on each snippet. This could lead
to leakage, even though it is going to be small.

The other aspect of using the same window/FFT length is that the mean frequency is
taken over the whole set of concatenated snippets. This means that longer snippets are weighted
more than the shorter ones. In other words, if we have to average two sound signals, one 1 second
long and the other 20 seconds long, the latter will affect our mean more than the
one-second-long signal.
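
The weighting can be illustrated with a deliberately simplified model in which the concatenated
analysis behaves like a duration-weighted average of the snippet frequencies. That model and the
two frequencies below are hypothetical and serve only to show the size of the effect.

```python
# Hypothetical (frequency in Hz, duration in s) pairs for two snippets of one tone.
snippets = [(300.0, 1.0), (310.0, 20.0)]


def unweighted_mean(items):
    """Every snippet counts once, regardless of its length."""
    return sum(freq for freq, _ in items) / len(items)


def duration_weighted_mean(items):
    """Each snippet counts in proportion to its duration."""
    total_duration = sum(duration for _, duration in items)
    return sum(freq * duration for freq, duration in items) / total_duration


plain = unweighted_mean(snippets)
weighted = duration_weighted_mean(snippets)
print(f"unweighted: {plain:.2f} Hz, duration-weighted: {weighted:.2f} Hz")
```

In this toy case the long snippet pulls the weighted mean to within half a Hz of its own
frequency, while the unweighted mean sits halfway between the two.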

We can take each snippet’s data vector and multiply it by a window of the same length,
then take the FFT of that product (multiplication in the time domain is the same as convolution
in the transform domain) and find the peak of that snippet, i.e., its frequency. The longer the
FFT length, the more padded zeroes we have and the finer the frequency grid on which the peak of
that snippet is located. Then we use these frequencies of the, say 20, different snippets of the
same tone to find the mean, standard deviation and variance of a given tone. The results are
shown in table 6 below.


TONE    Chrisanth    Patriarchal    Experimental    Snippet Mean
                     Committee      Results         and SD
D       292.68       293.67         290             290.27±3.04
Eup     320.78       323.33         320 (Eup)       320.37±5.57
Edn                                 305 (Edn)       309.19±4.61
Eexp                                313 (Eexp)      314.78±7.59
F       344.52       349.20         340             341.04±4.96
G       389.35       391.98         390             389.20±6.61
A       440          440            440             440.71±7.20


Table 6. Two theoretical values and two experimental values. (SD: Standard Deviation).


The window length was the same as the data (snippet) length and the FFT length was
16384 samples. This gives a Δf ≈ 0.6729 Hz, and that is why I keep the numbers in table 6 to two
decimal places. Because the snippet length was always significantly smaller than the FFT length,
zero padding was applied.
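
The per-snippet procedure can be sketched in Python as follows. This is an illustration, not
the code used for table 6: the sampling rate of 11025 Hz is our assumption (it is the value that
reproduces Δf ≈ 0.6729 Hz for 16384 points), the Hann window is an assumed window type, and the
snippets are synthetic sines rather than the recording. To stay self-contained the sketch
evaluates the zero-padded DFT bins directly over a search band instead of calling an FFT
routine; the bin values are the same.

```python
import cmath
import math
import statistics

FS = 11025.0    # ASSUMED sampling rate in Hz (not stated in the text)
NFFT = 16384    # FFT length quoted in the text
DF = FS / NFFT  # spacing of the zero-padded frequency grid, about 0.6729 Hz


def hann(length):
    """Hann window as long as the snippet (the window type is an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def peak_frequency(snippet, f_lo, f_hi):
    """Frequency (Hz) of the largest zero-padded DFT bin between f_lo and f_hi."""
    windowed = [s * w for s, w in zip(snippet, hann(len(snippet)))]
    best_bin, best_mag = None, -1.0
    for k in range(math.ceil(f_lo / DF), math.floor(f_hi / DF) + 1):
        # Bin k of an NFFT-point DFT; the padded zeros add nothing to the sum.
        omega = -2.0 * math.pi * k / NFFT
        value = sum(x * cmath.exp(1j * omega * n) for n, x in enumerate(windowed))
        if abs(value) > best_mag:
            best_bin, best_mag = k, abs(value)
    return best_bin * DF


def synthetic_snippet(freq_hz, n_samples):
    """Pure sine used only to exercise the sketch."""
    return [math.sin(2.0 * math.pi * freq_hz * n / FS) for n in range(n_samples)]


# Three hypothetical snippets of one tone, of different lengths and pitch.
true_freqs = (288.0, 290.0, 293.0)
snippets = [synthetic_snippet(f, n) for f, n in zip(true_freqs, (1200, 1500, 1000))]
estimates = [peak_frequency(s, 270.0, 310.0) for s in snippets]
mean_hz = statistics.mean(estimates)
sd_hz = statistics.stdev(estimates)  # sample standard deviation (n - 1)
print([round(f, 2) for f in estimates], round(mean_hz, 2), round(sd_hz, 2))
```

Each snippet contributes one frequency to the mean regardless of its length, which is the
point of averaging the snippets independently. Whether the sample or the population standard
deviation was used for table 6 is not stated; the sketch uses the sample form.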

The first comment on table 6 is that the mean frequency obtained from averaging
all snippets individually is remarkably close to the value obtained over all realizations. This
indicates that the concatenated data did not contain long snippets far from the mean.
Notice, for example, that tone A still possesses its approximately 440 Hz frequency. With the
exception of Edn, all other tones are comparatively very close in frequency. How close is close
enough we will consider below. First we introduce the necessary basis, the concept of pitch
discrimination, and we then proceed to connect it with the standard deviation.


Pitch Discrimination 


In the subsection above we found the differences in frequency and we considered them
negligible. In chapter 3 we said on various occasions that the frequency difference is not of an
order that should alarm us as significant. For example, the pro-echos was lower than the final
termination tone D by 7 Hz, the two scales differ by anything from a fraction of a Hz up to 5 Hz,
etc. How much is significant or negligible for this research is determined by how much the human
ear can resolve. After all, in this paper we are dealing with vocal music, and if a performer
cannot differentiate two tones that physically differ in frequency he cannot reproduce this
indistinguishable interval. In this section we briefly discuss the psychoacoustic aspect of pitch
discrimination.

Pitch is the psychological aspect of sound sensation related to a physical characteristic
of a sound, namely its frequency. A sound has a frequency (physical characteristic), but if there
is no one there to hear it, it doesn’t possess pitch. Pitch is the perceptual aspect of an
acoustic wave. Work on pitch discrimination was done extensively in the 1920’s and 1930’s, even
though experiments had been conducted since the 1880’s. One of the most cited references is that
of Shower and Biddulph (1931), which is described as the most elegant and controlled study on
pitch differentiation (Gulick, 1971).

The results show that for frequencies between 125 and 2000 Hz, at a comfortable
sensation level of around 40 dB, the human auditory resolution (Δf) is about 3 Hz (Gulick, 1971,
p. 126). Some authors refer to this Δf as the just noticeable difference, or jnd for short. There
are other ways one can express this finding (Δf/f, etc.), but for our purposes the above
statement is sufficient.

Throughout this research we came across frequency differences of 5 and 7 Hz. Even
though the trained ear can hear a minute difference under ideal conditions, these differences are
certainly negligible. Needless to say, any differences less than 3 Hz are nonexistent for our
ears.

Within this context of the jnd, a standard deviation equal to the jnd, i.e., 3 Hz, is absolutely
justifiable; we cannot expect the performer to perform differences that cannot even be heard.
Tone D, for example, is the one with a standard deviation of about 3 Hz, so the chanter was
performing it every time very close to the mean value of tone D (~ 290 Hz). This is not surprising
since the chanter had the group of accompanying chanters to “remind” him exactly where tone D
was (isokratima). By the way, inferences like this one (based on separate snippets) could not be
made with the mean frequency obtained from all realizations considered together; that is why we
averaged the snippets independently.

Standard deviation is a way to define how much a performer can deviate from the 
indicated theoretical or experimental mean frequency value of a given tone. Since theoretical 
scales do not provide us with the acceptable frequency deviation for a performance to be 
accounted as a correct performance interval-wise, we resort to finding this acceptable deviation 
by experimentation. The more reliable (recognized) the subject (chanter) the more reliable our 
results would be. 

Standard deviations in table 6 vary from about 3 to 7 Hz, therefore we can suggest that a 
chanter that deviates by about 2 jnd’s from the mean frequency of a tone is of the class of Mr. 
Stanitsas, one of the most recognized chanters. 

Is the above acceptable deviation of 2 jnd’s the same for main and secondary tones?
Notice that standard deviations are not smaller for main tones and therefore a very good chanter
should not deviate from the mean frequency of a tone by much more than 2 jnd’s.

It is readily seen from table 6 that 2 jnd’s is the acceptable deviation for both main and
secondary tones. Since we have already established that the attraction effect is a real effect
naturally occurring in chanting, we conclude that secondary tones subject to the attraction effect
are methodically and systematically reproduced, not by chance, but intentionally. The attraction
effect is a voluntary act of performing Byzantine Music scales according to tradition. Since
secondary tones usually produce dissonant intervals with respect to the basis of the mode, the
chanter must be rather skillful to achieve such an admirable consistency as that of Mr. Stanitsas.

In this last part of the discussion section I will provide evidence that a rigorous 
mathematical treatment of atoms is not necessary for implementing better interval performance. 
This will be tied with the fact that Chrisanth may have not treated the subject of atoms extremely 
scientifically, because he believed that scales are correctly learned given two facts: the teacher 
knows how to perform the scales correctly and the student has the ability to learn. 

Chrisanth stated that the student should learn from a “Hellenic chanter” and not another,
because he himself was acquainted with oriental, European, and BM and he knew that other
musicians use different intervals. He also stated that “on the schematic representations of the
scales [like figure 1], where intervals have numbers of 11 or 13 [atoms], due to the
undistinguishable of the unity, they are called major tones [12 atoms] (see introduction p. 18).”
In other words, he defined 1 atom as his allowed deviation. Let’s check this using equation (14).
Because we need a frequency ratio that corresponds to frequencies that differ by 3 Hz (the jnd),
we start from tone D at 290 Hz and see how many atoms the ratio 293/290 represents.
Surprisingly, this ratio corresponds to 1.0096 atoms. Chrisanth was correct in defining his jnd
intuitively. What about an octave higher than 290 Hz? The ratio is 583/580, which
corresponds to 0.5061 atoms. At the lower limit, let us check around 200 Hz, a sound that is very
seldom if ever used in chanting. The ratio 203/200 corresponds to 1.4606 atoms with the 68-atom
total used above (the value 1.5465 results from a total of 72 atoms).
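
The size of a 3 Hz jnd in atoms at different pitches can be checked with equation (14). The
following Python sketch is only a worked illustration; the function name is ours, and the three
test frequencies are the ones discussed above, evaluated for both totals of 68 and 72 atoms.

```python
import math

JND_HZ = 3.0  # just noticeable difference quoted from Gulick (1971)


def jnd_in_atoms(freq_hz, total_atoms, jnd_hz=JND_HZ):
    """Atoms spanned by the interval freq_hz -> freq_hz + jnd_hz (equation 14)."""
    return total_atoms * math.log2((freq_hz + jnd_hz) / freq_hz)


jnd_atoms = {
    total: {freq: jnd_in_atoms(freq, total) for freq in (200.0, 290.0, 580.0)}
    for total in (68, 72)
}

for total, row in jnd_atoms.items():
    print(total, {freq: round(value, 4) for freq, value in row.items()})
```

Because the jnd is roughly constant in Hz while an atom is a constant frequency ratio, the jnd
shrinks from about one and a half atoms near 200 Hz to about half an atom near 580 Hz.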

This last remark will have a special appeal to the reader acquainted with BM. Of course,
if we find the atoms according to the Patriarchal Committee’s total number of 72 atoms, the
corresponding atoms for the jnd at the various frequency limits are very close to the
ones of Chrisanth, as expected. But the Patriarchal Committee never said anything about allowed
variations. They devised an instrument to accurately and rigorously determine the intervals, as if
they could be performed in such an accurate way.

Chrisanth, totally intuitively and long before the jnd was established experimentally, gave
an allowed deviation from the mean frequency approximately equal to the jnd. On a historical
note, the first scientific attempt to define the jnd, some 70 years after Chrisanth’s
publication, found it to be around 0.2 Hz (Luft, 1888), much further from the currently accepted
value than Chrisanth’s intuitive guess.


CHAPTER 5


CONCLUSIONS 


The result that both scales are approximately equally close in frequency to the 
experimental results suggests that neither is better in imposing how the intervals of the diatonic 
scale should be performed. By “equally close” we mean that even though both scales differ from 
the experimental results, these differences are negligible and occur in both scales. The above 
argument is better put in realistic context reversely: scales are accurate portrayals of culture. 

We experimentally found tone F substantially lower than the tone F proposed by the two 
theoretical scales. We reserve this topic for further investigation. 

How accurate a performance is can be better judged within the context of two
factors: how close the experimental results are to the theoretical scales and how consistent the
chanter is within the piece. The experimentally derived scale is close to both theoretical scales
considered in this paper, and the chanter is consistent throughout the piece. To examine the first
factor we considered an FFT weighted average. For the second factor we used the individual
snippets to determine the average and standard deviation.

When the standard deviation is tied to the notion of jnd’s we define an
experimental allowed deviation from the mean, not provided by theory. This result can be used
in applications like determining how good a good performance is, according to other
performances used as standards.

Representing scales schematically using atoms is no more than a visual aid, since both
scales approximate the experimental results well. A chanter trained by a traditional teacher will
perform traditional intervals no matter which scale he is asked to perform. This may not be the
case with a singer trained in the well-tempered scale of European music, or a chanter who was
trained outside the Byzantine music tradition.

Chrisanth’s approach to atoms, being not as rigorous as that of later theorists, is justifiable.
Determining intervals in a meticulously mathematical manner does not ensure better
performance or a better way of teaching the Byzantine Music scales.

The attraction effect occurs regularly in performing the diatonic scale. Theoretical
textbooks choose to represent the diatonic scale using only the ascending portion of it. Pulled
tones are taught traditionally. Theorists speak of the attraction effect in a more general form,
not indicating the effect on the schematic representations of the scales.

Secondary tones have no more standard deviation than main tones. This suggests that the
attraction effect is not only a well established phenomenon in the Byzantine Music tradition, but
it is intentional and hard to achieve.


Suggested Future Research 


In this paper we considered only one scale (diatonic), two proposed theoretical scales of 
the same school (traditional) and one performer as sample (Mr. Stanitsas). Therefore our 
conclusions are limited within the above boundaries. 

For future research we will analyze the remaining three scales of Byzantine Music (two
chromatic scales, and an enharmonic) and determine the means and standard deviations. We will
do this using a traditional performer and another performer representative of the new movement
in Byzantine Music circles. Then we will compare the traditional interval with the modern
interval and see how different they are. We will determine if the modern movement intervals are
closer to traditional Byzantine Music intervals or closer to European intervals.

We can use different performers representative of the same school (traditional or modern) 
and see how close they are to each other. We will try and determine if the differences within 
movements are the same as differences in between movements. If not we establish 
experimentally that differences exist and we can — to some extent — quantify these differences. 

We will see if performers like Mr. Stanitsas are as consistent with other scales as he is 
with the diatonic. This way we may conclude that some scales are “easier” to perform than 
others. 

We will determine whether tone F is performed lower than the value proposed by the theoretical
scales, using not only one music piece performed by one chanter but several pieces performed by
other representative chanters of both movements (traditional and modern). It is believed that
European vocalists prefer the Just scale over the well-tempered scale. A further investigation
will show if tone F performed lower is closer to the Just scale, and thus closer to the
preference of European vocalists.

We can conduct experimental work not only with Byzantine Music intervals, but also
with the way special characters are performed (qualitative characters). And within this idea we
can again check across movements, scales and performers.


APPENDIX A


Appendix A contains audio files (*.wav) that will help the listener better relate to the
research and especially the Methodology section. First we deposit part of the music piece used to
extract the snippets. Then we provide the concatenated snippets as one can hear them when
running the appropriate Matlab® *.m file. The numbers (names of wave files) correspond to the
description given below.

1. Part of Music piece by Mr. Stanitsas.
2. Concatenated D tone.
3. Concatenated E tone with all 40 samples.
4. Concatenated F tone with all 24 samples.
5. Concatenated G tone with 20 samples.
6. Concatenated A tone with 20 samples.
7. Pro-echos.


APPENDIX B


Appendix B shows the manuscript of the music piece performed by Mr. Stanitsas. Red
characters indicate tone D, but do not necessarily correspond to the correct order of snippets of
tone D. Only the first few are shown here in red. This piece was composed by Jacob the
arch-chanter and it was kindly provided by Dr. Nick Giannoukakis as a handwritten manuscript. It
was typed by the author.